# Riksdag Thesis - Swedish Parliament Analysis Project

## Overview
This project analyzes Swedish Parliament (Riksdag) speeches, debates, and political discourse.

- **Repository:** https://github.com/Umar-Shahid/Data-Analysis
- **Datasets:** https://kaggle.com/datasets/muhammadumarshahid/riksdag-speeches-processed-csvs

## Project Structure
- `scripts/` - Python analysis and processing scripts
- `notebooks/` - Jupyter notebooks for exploration and analysis
- `data/` - Data files and processing pipelines
- `output/` - Analysis results, figures, and tables

## Key Scripts
1. `01_explore_riksdag.py` - Initial data exploration
2. `02_download_metadata.py` - Download Riksdag debate metadata
3. `03_download_transcripts.py` - Download speech transcripts
4. `04_parse_speeches.py` - Parse and clean transcript data
5. `06_opponent_references.py` - Identify and analyze opponent references
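
The numeric prefixes suggest the scripts are meant to run in order. The following is a minimal sketch of a runner, assuming each script is standalone and is invoked from the repository root (as in Getting Started). The helper names `build_commands` and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script names taken from the Key Scripts list; the execution order is an assumption.
SCRIPTS = [
    "01_explore_riksdag.py",
    "02_download_metadata.py",
    "03_download_transcripts.py",
    "04_parse_speeches.py",
    "06_opponent_references.py",
]


def build_commands(scripts_dir="scripts"):
    """Build one command per script, using the current Python interpreter."""
    return [[sys.executable, str(Path(scripts_dir) / name)] for name in SCRIPTS]


def run_pipeline(scripts_dir="scripts"):
    """Run the scripts one after another and stop at the first failure."""
    for cmd in build_commands(scripts_dir):
        print(f"Running {cmd[-1]}")
        # check=True raises CalledProcessError if a script exits with a non-zero code
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would then execute the five scripts in sequence.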

## Getting Started
```bash
pip install -r requirements.txt
python scripts/01_explore_riksdag.py
```

## Data
The large processed datasets are available on Kaggle:
- `all_speeches.csv` (116.8 MB)
- `speeches_with_opponents.csv` (117.2 MB)
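
Because each file is over 100 MB, it can help to read it in chunks rather than all at once. The sketch below counts rows with the `chunksize` option of `pandas.read_csv`. The function name `count_rows_in_chunks` and the chunk size are illustrative assumptions, and no particular column layout is assumed.

```python
import pandas as pd


def count_rows_in_chunks(csv_path, chunksize=100_000):
    """Count data rows in a CSV without loading the whole file into memory."""
    total = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    for chunk in pd.read_csv(csv_path, chunksize=chunksize):
        total += len(chunk)
    return total
```

For example, `count_rows_in_chunks('all_speeches.csv')` would return the number of data rows, assuming the file has been downloaded to the working directory.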

```python
# Import key libraries
import pandas as pd
import numpy as np
import os
from pathlib import Path

# Set up paths
project_root = Path('../input/riksdag-speeches-processed-csvs')

# Check available files
print('Available data files:')
for f in project_root.glob('*.csv'):
    size = f.stat().st_size / (1024*1024)
    print(f'  - {f.name} ({size:.1f} MB)')
```
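
After listing the files, a natural next step is to inspect their columns without loading them fully. This is a minimal sketch that reads only the first few rows of each CSV. The helper name `preview_csvs` is hypothetical, and the column names it returns depend entirely on the downloaded data.

```python
from pathlib import Path

import pandas as pd


def preview_csvs(data_dir, nrows=5):
    """Map each CSV file name in data_dir to its list of column names."""
    columns = {}
    for csv_file in sorted(Path(data_dir).glob('*.csv')):
        # nrows limits parsing to the first few rows, so large files stay cheap to inspect
        sample = pd.read_csv(csv_file, nrows=nrows)
        columns[csv_file.name] = list(sample.columns)
    return columns
```

With the paths set up as above, `preview_csvs(project_root)` would return one entry per CSV file found in the dataset directory.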
