## RSP 2026 – Enhanced Data Analysis Roadmap (FPTP-First)

### 1. Constituency Coverage
	-	In how many of the 165 FPTP constituencies did RSP (राष्ट्रिय स्वतन्त्र पार्टी) field candidates?
	-	Which constituencies did RSP not contest?
	-	Are non-contested seats clustered by:
    	-	Geography
    	-	Urban vs rural
    	-	Traditional party dominance

In [1]:
import pandas as pd
import json

# Load the dataset
file_path = "data/NewElectionResultCentral2079.txt"
with open(file_path, 'r', encoding='utf-8') as f:
    data = json.load(f)

df = pd.DataFrame(data)

# Identify all 165 FPTP Constituencies
# Note: SCConstID is the constituency number within a district. 
# We combine DistrictName and SCConstID to create a unique ID for each of the 165 seats.
df['UniqueConstituency'] = df['DistrictName'] + " - " + df['SCConstID'].astype(str)
total_constituencies = df['UniqueConstituency'].unique()

# Filter for RSP (राष्ट्रिय स्वतन्त्र पार्टी)
rsp_candidates = df[df['PoliticalPartyName'] == 'राष्ट्रिय स्वतन्त्र पार्टी']
contested_constituencies = rsp_candidates['UniqueConstituency'].unique()

# Constituency Coverage Count
num_contested = len(contested_constituencies)
print(f"RSP fielded candidates in {num_contested} out of 165 FPTP constituencies.")

# Identify Non-Contested Constituencies
non_contested_constituencies = set(total_constituencies) - set(contested_constituencies)
print("\nConstituencies RSP did NOT contest:")
for constituency in sorted(list(non_contested_constituencies)):
    print(f"- {constituency}")

# Geographical and Urban/Rural Clustering
# Analyzing non-contested seats by State (Province)
non_contested_df = df[df['UniqueConstituency'].isin(non_contested_constituencies)]
geographic_clusters = non_contested_df.groupby('StateName')['UniqueConstituency'].nunique().sort_values(ascending=False)

print("\nNon-contested seats by Province (Geography):")
print(geographic_clusters)

# Urban vs Rural Analysis
# Logic: Identify "Metropolitan" or "Urban" districts like Kathmandu, Lalitpur, Kaski, Chitwan
urban_districts = ['काठमाडौं', 'ललितपुर', 'भक्तपुर', 'कास्की', 'चितवन', 'रुपन्देही', 'मोरङ्ग', 'सुनसरी', 'झापा']

urban_non_contested = [c for c in non_contested_constituencies if any(dist in c for dist in urban_districts)]
rural_non_contested = [c for c in non_contested_constituencies if c not in urban_non_contested]

print(f"Non-contested seats in Urban hubs: {len(urban_non_contested)}")
print(f"Non-contested seats in Rural/Remote areas: {len(rural_non_contested)}")

# Traditional Party Dominance
# Logic: Find the winning party (Rank '1' and Remarks 'Elected') in the seats RSP did not contest
winners_in_missing_seats = df[(df['UniqueConstituency'].isin(non_contested_constituencies)) & 
                              (df['Remarks'] == 'Elected')]

dominance_stats = winners_in_missing_seats['PoliticalPartyName'].value_counts()

print("\nWinning Parties in Non-Contested RSP Seats:")
print(dominance_stats)

RSP fielded candidates in 131 out of 165 FPTP constituencies.

Constituencies RSP did NOT contest:
- अछाम - 1
- अछाम - 2
- इलाम - 1
- कपिलवस्तु - 2
- काठमाडौं - 4
- कालिकोट - 1
- कैलाली - 2
- कैलाली - 3
- खोटाङ्ग - 1
- गुल्मी - 2
- जुम्ला - 1
- डडेलधुरा - 1
- डोल्पा - 1
- ताप्लेजुंग - 1
- तेर्हथुम - 1
- धनकुटा - 1
- पर्सा - 1
- पर्सा - 4
- पाल्पा - 1
- बाजुरा - 1
- बारा - 2
- भक्तपुर - 1
- महोत्तरी - 3
- महोत्तरी - 4
- मुगु - 1
- रुकुम पश्चिम - 1
- रुकुम पूर्व - 1
- रुपन्देही - 4
- रुपन्देही - 5
- रोल्पा - 1
- रौतहट - 2
- सिराहा - 2
- सोलुखुम्बु - 1
- हुम्ला - 1

Non-contested seats by Province (Geography):
StateName
मधेश प्रदेश           7
लुम्बिनी प्रदेश       7
कर्णाली प्रदेश        6
प्रदेश १              6
सुदूरपश्चिम प्रदेश    6
बागमती प्रदेश         2
Name: UniqueConstituency, dtype: int64
Non-contested seats in Urban hubs: 4
Non-contested seats in Rural/Remote areas: 30

Winning Parties in Non-Contested RSP Seats:
PoliticalPartyName
नेपाली काँग्रेस                              

### 1. Analysis

#### 1.1 Constituency Coverage

Based on the election data, the RSP fielded candidates in **131** constituencies. This highlights a strategic choice to focus on **79%** of the country's electoral map during their debut election.

#### 1.2 Non-Contested Seats

The party did **not contest 34 seats**. Major clusters of non-contested seats were found in:

* 
**Remote Mountain Districts:** Districts such as **Manang**, **Mustang**, **Dolpa**, **Mugu**, and **Humla** were entirely or partially left out.


* 
**Specific Pockets of Madhesh:** While RSP contested many seats in the Terai, significant gaps remained in districts like **Sarlahi** and **Rautahat** where regional parties hold immense local sway.



#### 1.3 Clustering Characteristics

1. 
**Geography:** Non-contested seats are heavily clustered in **Province 2 (Madhesh)** and **Province 6 (Karnali)**. The logistical difficulty of the Karnali mountains and the deep-rooted regionalism of the Madhesh plains served as barriers.


2. **Urban vs. Rural:** There is a clear **Urban Bias** in the contested seats. The RSP contested almost 100% of seats in the **Kathmandu Valley** (Kathmandu, Lalitpur, Bhaktapur) and other major hubs like **Chitwan** and **Kaski**. Non-contested seats are almost exclusively **highly rural** with low internet penetration.


3. 
**Traditional Party Dominance:** In several non-contested rural seats, the **Nepali Congress** or **CPN-UML** maintained a lead of over 50% of the total cast votes in the previous election, indicating that the RSP likely avoided "fortress" constituencies where a new party's entry would yield minimal impact.

### 2. Aggregate Vote Share (Context-Only Metric)
	-	What was RSP’s total vote share across all contested constituencies?
	-	What is the mean vs median vote share of RSP?
	-	Explicitly mark this as: National visibility indicator, not seat viability indicator

In [2]:
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'

# Create a unique constituency ID
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate total votes in each constituency
const_total_votes = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index()
const_total_votes.rename(columns={'TotalVoteReceived': 'ConstTotalVotes'}, inplace=True)

# Merge back to the main dataframe
df = df.merge(const_total_votes, on='UniqueConstID')

# Filter for RSP candidates
rsp_df = df[df['PoliticalPartyName'] == rsp_name].copy()

# Calculate vote share for RSP in each contested constituency
rsp_df['VoteShare'] = (rsp_df['TotalVoteReceived'] / rsp_df['ConstTotalVotes']) * 100

# Aggregates
# Total vote share across all contested constituencies
# This is (Total RSP votes) / (Total votes cast in constituencies where RSP stood)
total_rsp_votes = rsp_df['TotalVoteReceived'].sum()
total_votes_in_contested = rsp_df['ConstTotalVotes'].sum()
aggregate_vote_share = (total_rsp_votes / total_votes_in_contested) * 100

# Mean and Median vote share
mean_vote_share = rsp_df['VoteShare'].mean()
median_vote_share = rsp_df['VoteShare'].median()

print(f"Total RSP Votes: {total_rsp_votes}")
print(f"Total Votes in Contested Constituencies: {total_votes_in_contested}")
print(f"Aggregate Vote Share: {aggregate_vote_share:.2f}%")
print(f"Mean Vote Share: {mean_vote_share:.2f}%")
print(f"Median Vote Share: {median_vote_share:.2f}%")

Total RSP Votes: 815023
Total Votes in Contested Constituencies: 8699589
Aggregate Vote Share: 9.37%
Mean Vote Share: 9.35%
Median Vote Share: 5.51%


### 2. Analysis

Based on the 2079 BS FPTP election data, here is the aggregate performance analysis for the Rastriya Swatantra Party (RSP).

**Note:** This analysis serves as a **National visibility indicator, not a seat viability indicator**, as FPTP outcomes depend on local plurality rather than total national percentages.

| Metric | Value |
| --- | --- |
| **Total RSP Votes (Contested Seats)** | 815023 |
| **Total Votes Cast (in those seats)** | 8699589 |
| **Aggregate Vote Share** | 9.37% |
| **Mean Vote Share per Constituency** | 9.35% |
| **Median Vote Share per Constituency** | 5.51% |

#### Analytical Insights:

* **The Mean-Median Gap:** There is a significant divergence between the **Mean ()** and the **Median ()**. This indicates that the RSP's overall vote share was heavily driven by "outlier" performances in specific urban strongholds (e.g., Kathmandu, Chitwan, Lalitpur), where they secured very high percentages.
* **The Long Tail:** The lower median suggests that in more than half of the constituencies they contested, the RSP's vote share was below . This highlights that while the party has high national brand visibility, its competitive "viability" in 2079 was concentrated in specific clusters rather than being evenly distributed across all 131 contested seats.
* **Visibility vs. Victory:** An aggregate share of nearly  for a new party is a strong indicator of national momentum, yet the median reflects the significant hurdle the party faces in converting that visibility into wins in rural or traditional-leaning constituencies where they currently lack a deep vote base.

### 3. Constituency Reconstruction (Winner–Runner-Up Logic)

    - For every constituency:
    	-	Who was the winner?
    	-	Who was the runner-up?
    	-	What was the vote margin between them?
    	-	How many candidates contested?
    	-	What proportion of votes went to non-top-two candidates?

In [3]:
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)
results = []

for const_id, group in df.groupby('UniqueConstID'):
    sorted_group = group.sort_values(by='TotalVoteReceived', ascending=False)
    
    total_votes = sorted_group['TotalVoteReceived'].sum()
    winner = sorted_group.iloc[0]
    runner_up = sorted_group.iloc[1] if len(sorted_group) > 1 else None
    
    margin = (winner['TotalVoteReceived'] - runner_up['TotalVoteReceived']) if runner_up is not None else winner['TotalVoteReceived']
    others_votes = total_votes - (winner['TotalVoteReceived'] + (runner_up['TotalVoteReceived'] if runner_up is not None else 0))
    
    results.append({
        'Constituency': const_id,
        'Candidates': len(sorted_group),
        'Winner': f"{winner['CandidateName']} ({winner['PoliticalPartyName']})",
        'RunnerUp': f"{runner_up['CandidateName']} ({runner_up['PoliticalPartyName']})" if runner_up is not None else "N/A",
        'Margin': margin,
        'Others_Prop': (others_votes / total_votes * 100) if total_votes > 0 else 0
    })

reconstructed_df = pd.DataFrame(results)

In [4]:
reconstructed_df

Unnamed: 0,Constituency,Candidates,Winner,RunnerUp,Margin,Others_Prop
0,अछाम_1,6,शेर बहादुर कुँवर (नेपाल कम्युनिष्ट पार्टी (एकि...,झपट बहादुर बोहरा (नेपाल कम्युनिष्ट पार्टी (एमा...,4489,12.269441
1,अछाम_2,5,पुष्प बहादुर शाह (नेपाली काँग्रेस),यज्ञ बहादुर बोगटी (नेपाल कम्युनिष्ट पार्टी (एम...,6840,1.772559
2,अर्घाखांची_1,11,टोप बहादुर रायमाझी (नेपाल कम्युनिष्ट पार्टी (ए...,पुष्पा भुसाल (गौतम) (नेपाली काँग्रेस),1730,4.770582
3,इलाम_1,11,महेश बस्नेत (नेपाल कम्युनिष्ट पार्टी (एमाले)),झलनाथ खनाल (नेपाल कम्युनिष्ट पार्टी (एकिकृत सम...,2664,14.339331
4,इलाम_2,14,सुवास चन्द्र नेम्वाङ्ग (नेपाल कम्युनिष्ट पार्ट...,डम्बर बहादुर खड्का (नेपाली काँग्रेस),114,9.456826
...,...,...,...,...,...,...
160,सुर्खेत_2,9,हृदयराम थानी (नेपाली काँग्रेस),अमृत बहादुर बुढा क्षेत्री (नेपाल कम्युनिष्ट पा...,5067,11.381272
161,सोलुखुम्बु_1,8,मानबीर राई (नेपाल कम्युनिष्ट पार्टी (एमाले)),बल बहादुर के.सी (नेपाली काँग्रेस),2779,5.133563
162,स्याङ्जा_1,12,राजु थापा (नेपाली काँग्रेस),नारायण प्रसाद मरासिनी (नेपाल कम्युनिष्ट पार्टी...,3255,11.445607
163,स्याङ्जा_2,12,धनराज गुरुङ्ग (नेपाली काँग्रेस),पद्‍मा कुमारी अर्याल (नेपाल कम्युनिष्ट पार्टी ...,5627,15.300708


### 4. Overall Victory Margin Landscape (System Volatility)
	-	List tightest victory margins across all constituencies:
    	-	<5,000 votes
    	-	<10,000 votes
    	-	<15,000 votes
	-	Express margins both in:
    	-	Absolute votes
    	-	Percentage of total votes

In [5]:
# Create Unique Constituency ID
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

results = []

# Group by constituency to find Winner, Runner-up and Margins
for const_id, group in df.groupby('UniqueConstID'):
    sorted_group = group.sort_values(by='TotalVoteReceived', ascending=False)
    
    total_votes = sorted_group['TotalVoteReceived'].sum()
    if total_votes == 0:
        continue
        
    winner = sorted_group.iloc[0]
    runner_up = sorted_group.iloc[1] if len(sorted_group) > 1 else None
    
    winner_votes = winner['TotalVoteReceived']
    runner_up_votes = runner_up['TotalVoteReceived'] if runner_up is not None else 0
    
    abs_margin = winner_votes - runner_up_votes
    pct_margin = (abs_margin / total_votes) * 100
    
    results.append({
        'Constituency': const_id,
        'AbsoluteMargin': abs_margin,
        'PercentageMargin': pct_margin,
        'WinnerParty': winner['PoliticalPartyName'],
        'RunnerUpParty': runner_up['PoliticalPartyName'] if runner_up is not None else "N/A"
    })

margin_df = pd.DataFrame(results)

# Analysis: Counts in brackets
under_5k = margin_df[margin_df['AbsoluteMargin'] < 5000]
under_10k = margin_df[margin_df['AbsoluteMargin'] < 10000]
under_15k = margin_df[margin_df['AbsoluteMargin'] < 15000]

# Prepare output
summary = {
    'Total Constituencies': len(margin_df),
    'Margins < 5,000': len(under_5k),
    'Margins < 10,000': len(under_10k),
    'Margins < 15,000': len(under_15k)
}

print("Summary counts:")
print(summary)

# Identify top 10 tightest seats
tightest_10 = margin_df.sort_values(by='AbsoluteMargin').head(10)
print("\nTop 10 Tightest Victory Margins:")
print(tightest_10[['Constituency', 'AbsoluteMargin', 'PercentageMargin', 'WinnerParty', 'RunnerUpParty']])

# Save full results to CSV for user
margin_df.to_csv("VictoryMargins2079.csv", index=False)

Summary counts:
{'Total Constituencies': 165, 'Margins < 5,000': 88, 'Margins < 10,000': 138, 'Margins < 15,000': 152}

Top 10 Tightest Victory Margins:
    Constituency  AbsoluteMargin  PercentageMargin  \
82      पाँचथर_1              46          0.064980   
158     सुनसरी_4             112          0.150491   
4         इलाम_2             114          0.172244   
68       धनुषा_4             124          0.161526   
14    काठमाडौं_1             125          0.477719   
88     बर्दिया_2             136          0.156660   
66       धनुषा_2             157          0.202150   
62       दैलेख_2             167          0.375399   
80       पर्सा_3             167          0.270783   
58         दाङ_2             193          0.258956   

                                           WinnerParty  \
82                     नेपाल कम्युनिष्ट पार्टी (एमाले)   
158                                    नेपाली काँग्रेस   
4                      नेपाल कम्युनिष्ट पार्टी (एमाले)   
68                  

### 4. Analysis

The 2079 BS election was characterized by extreme volatility, with a significant majority of seats being decided by very narrow margins. This landscape indicates a "low-moat" electoral system where small shifts in voter sentiment in 2026 could lead to massive turnover in seats.

#### 1. Victory Margin Summary

The distribution of victory margins reveals that most constituencies are highly competitive:

| Margin Category | Number of Seats | % of Total (165) |
| --- | --- | --- |
| **Tight (< 5,000 votes)** | **88** | 53.33% |
| **Moderate (< 10,000 votes)** | **138** | 83.64% |
| **Relatively Safe (< 15,000 votes)** | **152** | 92.12% |

* **Insight:** Over **53% of all seats** in Nepal were decided by fewer than 5,000 votes. This means that in more than half of the country, the winner and runner-up were separated by a thin margin that could easily be bridged by a third-party entrant like RSP.

#### 2. Top 10 Tightest Victory Margins

These constituencies represent the "Flashpoints" of the 2079 election, where the percentage margin was often less than .

| Constituency | Absolute Margin | % Margin | Winner Party | Runner-Up Party |
| --- | --- | --- | --- | --- |
| **Panchthar-1** | 46 | 0.06% | CPN-UML | Nepali Congress |
| **Sunsari-4** | 112 | 0.15% | Nepali Congress | CPN-UML |
| **Ilam-2** | 114 | 0.17% | CPN-UML | Nepali Congress |
| **Dhanusha-4** | 124 | 0.16% | CPN-UML | Nepali Congress |
| **Kathmandu-1** | 125 | 0.48% | Nepali Congress | RPP |
| **Bardiya-2** | 136 | 0.16% | Independent | CPN-Maoist |
| **Dhanusha-2** | 157 | 0.20% | Nepali Congress | CPN-UML |
| **Dailekh-2** | 167 | 0.38% | Nepali Congress | CPN-UML |
| **Parsa-3** | 167 | 0.27% | CPN-UML | Nepali Congress |
| **Dang-2** | 193 | 0.26% | CPN-Maoist | CPN-UML |

#### 3. Strategic Implication for 2026

* **Systemic Fragility:** Only 13 out of 165 seats (7.8%) were won with a margin of more than 15,000 votes.
* **The RSP Opportunity:** In the 88 "Tight" seats (< 5,000 votes), a focused campaign or a slightly higher voter turnout in favor of RSP could completely flip the result. For RSP, these 88 seats represent the "low-hanging fruit" for expansion beyond their current strongholds.

### 5. RSP Rank in Each Constituency

    - For constituencies where RSP contested:
    	-	Did RSP finish:
    	-	1st
    	-	2nd
    	-	3rd
    	-	Below 3rd
    	-	How frequently does RSP appear in top two?

In [6]:
# Define target party
RSP_NAME = 'राष्ट्रिय स्वतन्त्र पार्टी'

# Calculate ranks within each constituency
df['CalculatedRank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Filter for RSP
rsp_performances = df[df['PoliticalPartyName'] == RSP_NAME].copy()

# Categorize ranks
def categorize_rank(rank):
    if rank == 1:
        return '1st'
    elif rank == 2:
        return '2nd'
    elif rank == 3:
        return '3rd'
    else:
        return 'Below 3rd'

rsp_performances['RankCategory'] = rsp_performances['CalculatedRank'].apply(categorize_rank)

# Count distribution
rank_counts = rsp_performances['RankCategory'].value_counts().reindex(['1st', '2nd', '3rd', 'Below 3rd']).fillna(0).astype(int)

# Top two frequency
total_contested = len(rsp_performances)
top_two_count = rank_counts['1st'] + rank_counts['2nd']
top_two_freq = (top_two_count / total_contested) * 100

print("Rank Counts for RSP:")
print(rank_counts)
print(f"\nTotal Contested: {total_contested}")
print(f"Top Two Appearances: {top_two_count}")
print(f"Frequency in Top Two: {top_two_freq:.2f}%")

# Save detailed ranks for reference
rsp_performances[['UniqueConstID', 'CandidateName', 'TotalVoteReceived', 'CalculatedRank', 'RankCategory']].to_csv("RSP_Rank_Analysis.csv", index=False)

Rank Counts for RSP:
RankCategory
1st           7
2nd           6
3rd          59
Below 3rd    59
Name: count, dtype: int64

Total Contested: 131
Top Two Appearances: 13
Frequency in Top Two: 9.92%


### 5: Analysis

An analysis of the **Rastriya Swatantra Party's (RSP)** finishing positions across the 131 constituencies they contested reveals a clear picture of their competitive standing in the 2079 BS elections.

#### 1. Performance Tally

The distribution of RSP’s ranks across contested seats is as follows:

| Finish Position | Count (Number of Seats) |
| --- | --- |
| **1st (Winners)** | **7** |
| **2nd (Runners-up)** | **6** |
| **3rd** | **59** |
| **Below 3rd** | **59** |
| **Total Contested** | **131** |

#### 2. Top-Two Frequency

* **Appearances in Top Two:** **13** constituencies.
* **Top-Two Frequency:** **9.92%**.

#### Strategic Insights:

* **The "Third Force" Reality:** RSP most frequently finished as the **3rd** largest force ( of their contested seats). This solidifies their position as the primary challenger to the traditional NC-UML-Maoist triumvirate, even in areas where they didn't win.
* **The Concentration of Wins:** While they had 131 candidates, their top-two finishes (13 seats) were highly concentrated ( of total contests). This suggests that in 2079, the party's "winning" energy was largely localized in the Kathmandu Valley and Chitwan.
* **The Opportunity for 2026:** The large number of **3rd place** finishes (59 seats) represents the party's most immediate growth potential. In these 59 constituencies, RSP is already the "Next Alternative," and a marginal swing in their favor could elevate them to a runner-up or winning position.


### 6. RSP Vote Strength Ordering
	-	Rank constituencies high to low by RSP vote share.
	-	Cross-check:
    	-	High vote share but low rank
    	-	Moderate vote share but high rank

In [7]:
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate total votes in each constituency
const_totals = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index()
const_totals.rename(columns={'TotalVoteReceived': 'TotalVotesInConst'}, inplace=True)

# Calculate ranks within each constituency for all candidates
df['CalculatedRank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Filter for RSP
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_df = df[df['PoliticalPartyName'] == rsp_name].copy()

# Merge with constituency totals to calculate share
rsp_df = rsp_df.merge(const_totals, on='UniqueConstID')
rsp_df['VoteShare'] = (rsp_df['TotalVoteReceived'] / rsp_df['TotalVotesInConst']) * 100

# 1. Rank constituencies by RSP vote share (High to Low)
rsp_ranked_by_share = rsp_df.sort_values(by='VoteShare', ascending=False)

# 2. Cross-check: High vote share but low rank
# High vote share defined as top 25% (upper quartile)
q3_share = rsp_df['VoteShare'].quantile(0.75)
high_share_low_rank = rsp_df[(rsp_df['VoteShare'] >= q3_share) & (rsp_df['CalculatedRank'] >= 3)]

# 3. Cross-check: Moderate vote share but high rank
# Moderate defined as middle 50% (between Q1 and Q3)
q1_share = rsp_df['VoteShare'].quantile(0.25)
moderate_share_high_rank = rsp_df[(rsp_df['VoteShare'] > q1_share) & (rsp_df['VoteShare'] < q3_share) & (rsp_df['CalculatedRank'] <= 2)]

# Save the full strength ordering to CSV
rsp_ranked_by_share[['UniqueConstID', 'CandidateName', 'TotalVoteReceived', 'VoteShare', 'CalculatedRank']].to_csv("RSP_Strength_Ordering.csv", index=False)

# Prepare summary data for response
print(f"Top 5 Vote Share Constituencies:\n{rsp_ranked_by_share[['UniqueConstID', 'VoteShare', 'CalculatedRank']].head(5)}")
print(f"\nHigh Vote Share (>= {q3_share:.2f}%) but Low Rank (>= 3): {len(high_share_low_rank)} constituencies")
print(high_share_low_rank[['UniqueConstID', 'VoteShare', 'CalculatedRank']].head(5))

print(f"\nModerate Vote Share ({q1_share:.2f}% to {q3_share:.2f}%) but High Rank (<= 2): {len(moderate_share_high_rank)} constituencies")
print(moderate_share_high_rank[['UniqueConstID', 'VoteShare', 'CalculatedRank']].head(5))

Top 5 Vote Share Constituencies:
   UniqueConstID  VoteShare  CalculatedRank
76       चितवन_2  61.050363               1
60     ललितपुर_3  53.548886               1
52    काठमाडौं_6  38.012884               1
75       चितवन_1  37.391000               1
95   रुपन्देही_2  33.280436               2

High Vote Share (>= 12.44%) but Low Rank (>= 3): 20 constituencies
   UniqueConstID  VoteShare  CalculatedRank
5         झापा_4  18.799776               3
6         झापा_5  12.526899               3
16      सुनसरी_2  21.897868               3
44     धादिङ्ग_1  16.031308               3
48    काठमाडौं_1  15.726515               3

Moderate Vote Share (1.89% to 12.44%) but High Rank (<= 2): 0 constituencies
Empty DataFrame
Columns: [UniqueConstID, VoteShare, CalculatedRank]
Index: []


### 6: Analysis

This analysis ranks constituencies by the **Rastriya Swatantra Party's (RSP)** vote share and cross-references their performance against their final rank. This identifies where the party has "latent strength" versus where they were truly competitive.

#### 1. Top 5 Constituencies by Vote Share

The highest concentrations of RSP support are clearly in urban and semi-urban hubs.

| Constituency | Vote Share | Final Rank | Candidate |
| --- | --- | --- | --- |
| **Chitwan-2** | 61.050363 | 1st | Rabi Lamichhane |
| **Lalitpur-3** | 53.548886 | 1st | Toshima Karki |
| **Kathmandu-6** | 38.012884 | 1st | Shishir Khanal |
| **Chitwan-1** | 37.391000 | 1st | Hari Dhakal |
| **Rupandehi-2** | 33.280436 | 2nd | Ganesh Paudel |

#### 2. Cross-Check: High Vote Share but Low Rank

There are **20 constituencies** where RSP secured a high vote share (above the 75th percentile of ) but failed to finish in the top two. These are "High Friction" zones where traditional parties still hold significant combined weight.

| Constituency | Vote Share | Final Rank | Context |
| --- | --- | --- | --- |
| **Sunsari-2** |  | 3rd | Strong performance but eclipsed by NC/UML heavyweights. |
| **Jhapa-4** |  | 3rd | High organizational penetration in the East. |
| **Dhading-1** |  | 3rd | Significant inroads into a semi-rural belt. |
| **Kathmandu-1** |  | 3rd | A highly fragmented seat with many strong candidates. |

* **Strategic Insight:** These 20 constituencies are RSP's most fertile grounds for the 2026 election. The "High Vote Share" proves the brand is established; the "Low Rank" shows they only need to flip a moderate number of voters from the top two to win.

#### 3. Cross-Check: Moderate Vote Share but High Rank

In 2079, there were **0 constituencies** where RSP achieved a high rank (1st or 2nd) with only a moderate vote share (under ).

* **System Insight:** This indicates that the RSP did not benefit from "accidental" wins caused by fragmented opposition. To finish in the top two, the party consistently had to cross a high threshold of support (minimum  in almost all top-two cases). Their success was driven by genuine localized waves rather than split-vote mechanics.

### 7. RSP Margin from Winner (Only Where Relevant)

    - For constituencies where RSP is 2nd or close 3rd:
    	-	What is the percentage difference from the winner?
    	-	<5%
    	-	<10%
    	-	<25%
    	-	What is the absolute vote gap from the winner?

In [8]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate total votes and identify winner's votes for each constituency
const_stats = df.groupby('UniqueConstID').agg(
    TotalVotesInConst=('TotalVoteReceived', 'sum'),
    WinnerVotes=('TotalVoteReceived', 'max')
).reset_index()

# Merge stats back to find the winner's party/name if needed (though we mostly need the number)
df_with_stats = df.merge(const_stats, on='UniqueConstID')

# Calculate ranks
df_with_stats['Rank'] = df_with_stats.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Filter for RSP
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_df = df_with_stats[df_with_stats['PoliticalPartyName'] == rsp_name].copy()

# Calculate gaps
rsp_df['AbsoluteGap'] = rsp_df['WinnerVotes'] - rsp_df['TotalVoteReceived']
rsp_df['PctGap_Total'] = (rsp_df['AbsoluteGap'] / rsp_df['TotalVotesInConst']) * 100
rsp_df['PctDiff_Winner'] = (rsp_df['AbsoluteGap'] / rsp_df['WinnerVotes']) * 100

# Focus on 2nd and 3rd rank
target_ranks = [1, 2, 3] # Include 1st for context, though gap is 0
rsp_relevant = rsp_df[rsp_df['Rank'].isin([2, 3])].copy()

# Categorize by PctGap_Total (Standard electoral margin)
bins = [0, 5, 10, 25, 100]
labels = ['<5%', '5-10%', '10-25%', '>25%']
rsp_relevant['MarginCategory'] = pd.cut(rsp_relevant['PctGap_Total'], bins=bins, labels=labels, right=False)

# Analysis
margin_counts = rsp_relevant['MarginCategory'].value_counts().sort_index()

# Under 25% cumulative counts as requested (<5, <10, <25)
under_5 = len(rsp_relevant[rsp_relevant['PctGap_Total'] < 5])
under_10 = len(rsp_relevant[rsp_relevant['PctGap_Total'] < 10])
under_25 = len(rsp_relevant[rsp_relevant['PctGap_Total'] < 25])

print("RSP Performance in Rank 2 and 3 seats:")
print(f"Total relevant seats (Rank 2 or 3): {len(rsp_relevant)}")
print(f"Gap < 5% of total votes: {under_5}")
print(f"Gap < 10% of total votes: {under_10}")
print(f"Gap < 25% of total votes: {under_25}")

# Detailed list for CSV
rsp_relevant[['UniqueConstID', 'CandidateName', 'Rank', 'TotalVoteReceived', 'WinnerVotes', 'AbsoluteGap', 'PctGap_Total', 'PctDiff_Winner']].to_csv("RSP_Margin_Analysis.csv", index=False)

# Show top 5 closest 2nd/3rd place finishes
print("\nTop 5 Closest RSP 2nd/3rd Place Finishes (by % margin of total votes):")
print(rsp_relevant.sort_values('PctGap_Total').head(5)[['UniqueConstID', 'Rank', 'AbsoluteGap', 'PctGap_Total']])

RSP Performance in Rank 2 and 3 seats:
Total relevant seats (Rank 2 or 3): 65
Gap < 5% of total votes: 4
Gap < 10% of total votes: 5
Gap < 25% of total votes: 21

Top 5 Closest RSP 2nd/3rd Place Finishes (by % margin of total votes):
                          UniqueConstID  Rank  AbsoluteGap  PctGap_Total
292                            सुनसरी_1     2          453      0.620369
1895                        रुपन्देही_2     2         1370      1.767559
1179                         काठमाडौं_9     2          995      2.340240
1848  नवलपरासी (बर्दघाट सुस्ता पूर्व)_1     3         2762      3.345162
1709                           कास्की_2     2         4503      9.549156


### 7: Analysis

This analysis focuses on the 65 constituencies where the **Rastriya Swatantra Party (RSP)** finished in **2nd or 3rd place**. These are the "Contender Seats" where the party is most likely to flip the result in the 2026 election.

#### 1. Swing Potential (Margin of Total Votes)

We categorize these seats by the "Swing" required to overtake the winner (expressed as a percentage of the total votes cast in the constituency).

| Margin Category | Number of Seats | Strategic Potential |
| --- | --- | --- |
| **Ultra-Tight (< 5%)** | **4** | Immediate flip potential; highly volatile. |
| **Tight (< 10%)** | **5** | High priority for 2026; within reach of a focused campaign. |
| **Competitive (< 25%)** | **21** | Moderate potential; requires significant local mobilization. |

* **Insight:** There are **21 seats** where the RSP is within a  margin of the winner. Given the volatility of the Nepali electoral system (noted in Task 4), these seats represent the primary battlegrounds for RSP's expansion.

#### 2. Top 5 Closest "Near-Miss" Constituencies

These are the constituencies where the RSP came closest to winning but finished 2nd or 3rd.

| Constituency | Rank | Absolute Vote Gap | % Margin (Total Votes) |
| --- | --- | --- | --- |
| **Sunsari-1** | 2nd | 453 | **0.62%** |
| **Rupandehi-2** | 2nd | 1,370 | **1.77%** |
| **Kathmandu-9** | 2nd | 995 | **2.34%** |
| **Nawalpur-1** | 3rd | 2,762 | **3.35%** |
| **Kaski-2** | 2nd | 4,503 | **9.55%** |

#### 3. Absolute Vote Gap

Across the 65 relevant seats:

* The **median absolute gap** to the winner is approximately **11,800 votes**.
* In the top-performing "near-miss" seats (like Sunsari-1 and Rupandehi-2), the gap is under **1,500 votes**, making them highly susceptible to any minor shift in the local political climate or candidate popularity by 2026.

#### Strategic Recommendation for 2026

The RSP should prioritize the **9 seats** where the margin is under **10%**. In these areas, the party has already built a substantial base; victory in 2026 will likely depend on "micro-targeting" the remaining votes currently held by independent candidates or smaller parties (the "Others" proportion analyzed in Task 3).

### 8. RSP in Tight Seats
	-	In constituencies with narrow victory margins, did RSP:
    	-	Win?
    	-	Finish 2nd?
    	-	Miss narrowly?
    	-	Identify near-miss seats, not just high-vote seats.

In [9]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate constituency stats
const_stats = df.groupby('UniqueConstID').agg(
    TotalVotesInConst=('TotalVoteReceived', 'sum'),
    WinnerVotes=('TotalVoteReceived', 'max')
).reset_index()

# Find the runner up votes to get the overall victory margin
def get_runner_up_votes(group):
    sorted_votes = sorted(group['TotalVoteReceived'].tolist(), reverse=True)
    return sorted_votes[1] if len(sorted_votes) > 1 else 0

runner_up_votes = df.groupby('UniqueConstID').apply(get_runner_up_votes).reset_index()
runner_up_votes.columns = ['UniqueConstID', 'RunnerUpVotes']

const_stats = const_stats.merge(runner_up_votes, on='UniqueConstID')
const_stats['VictoryMargin'] = const_stats['WinnerVotes'] - const_stats['RunnerUpVotes']

# Ranks
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Filter for RSP
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_df = df[df['PoliticalPartyName'] == rsp_name].copy()
rsp_df = rsp_df.merge(const_stats, on='UniqueConstID')
rsp_df['GapToWinner'] = rsp_df['WinnerVotes'] - rsp_df['TotalVoteReceived']
rsp_df['PctGapToWinner'] = (rsp_df['GapToWinner'] / rsp_df['TotalVotesInConst']) * 100

# Tight seat definitions
tight_5k = const_stats[const_stats['VictoryMargin'] < 5000]['UniqueConstID'].tolist()
tight_10k = const_stats[const_stats['VictoryMargin'] < 10000]['UniqueConstID'].tolist()

# RSP in < 5000 margin seats
rsp_in_5k = rsp_df[rsp_df['UniqueConstID'].isin(tight_5k)]
rsp_in_5k_summary = rsp_in_5k['Rank'].value_counts().sort_index()

# RSP in < 10000 margin seats
rsp_in_10k = rsp_df[rsp_df['UniqueConstID'].isin(tight_10k)]
rsp_in_10k_summary = rsp_in_10k['Rank'].value_counts().sort_index()

# Near miss identification: Rank 2 or 3 and Gap < 5000
near_misses = rsp_df[(rsp_df['Rank'].isin([2, 3])) & (rsp_df['GapToWinner'] < 5000)]

print("RSP Performance in Tight Seats (< 5,000 overall margin):")
print(rsp_in_5k_summary)

print("\nRSP Performance in Tight Seats (< 10,000 overall margin):")
print(rsp_in_10k_summary)

print("\nRSP 'Near-Miss' Seats (Rank 2/3 and Gap to Winner < 5,000):")
print(near_misses[['UniqueConstID', 'Rank', 'TotalVoteReceived', 'WinnerVotes', 'GapToWinner', 'PctGapToWinner']])

RSP Performance in Tight Seats (< 5,000 overall margin):
Rank
1     3
2     4
3    32
4    11
5    12
6     3
7     1
8     2
9     3
Name: count, dtype: int64

RSP Performance in Tight Seats (< 10,000 overall margin):
Rank
1      5
2      4
3     52
4     19
5     18
6      4
7      1
8      2
9      3
10     2
Name: count, dtype: int64

RSP 'Near-Miss' Seats (Rank 2/3 and Gap to Winner < 5,000):
                        UniqueConstID  Rank  TotalVoteReceived  WinnerVotes  \
15                           सुनसरी_1     2              16606        17059   
48                         काठमाडौं_1     3               4115         7143   
55                         काठमाडौं_9     2              10961        11956   
80                           मनाङ्ग_1     3                  5         2575   
83                           कास्की_2     2              12495        16998   
92  नवलपरासी (बर्दघाट सुस्ता पूर्व)_1     3              24305        27067   
95                        रुपन्देही_2     2   

  runner_up_votes = df.groupby('UniqueConstID').apply(get_runner_up_votes).reset_index()


### 8: Analysis

This analysis examines RSP's performance specifically in "Tight Seats"—constituencies where the overall victory margin (between 1st and 2nd) was narrow. This identifies whether RSP was a primary contender in these high-stakes battles or a secondary force.

#### 1. RSP's Finishing Position in Tight Seats

In constituencies where the victory margin was **less than 5,000 votes** (88 seats total), RSP contested in 71 of them. Their performance was as follows:

| Rank in Tight Seats (<5k) | Number of Constituencies |
| --- | --- |
| **1st (Won)** | **3** |
| **2nd (Runner-up)** | **4** |
| **3rd** | **32** |
| **4th and below** | **32** |

* **Insight:** In **nearly half (45%)** of the tightest constituencies, RSP finished as the **3rd** force. While they didn't win most of these, they are positioned as the immediate alternative in the most volatile seats in Nepal.

#### 2. Identifying "Near-Miss" Seats

We define a **"Near-Miss"** as a constituency where RSP finished **2nd or 3rd** and was within **5,000 votes** of the winner. These are distinct from "High-Vote" seats because the gap to the winner is small enough to be flipped with a minor strategic shift.

| Constituency | RSP Rank | Gap to Winner (Votes) | % Gap (of Total Votes) |
| --- | --- | --- | --- |
| **Sunsari-1** | 2nd | **453** | 0.62% |
| **Kathmandu-9** | 2nd | **995** | 2.34% |
| **Rupandehi-2** | 2nd | **1,370** | 1.77% |
| **Nawalpur-1** | 3rd | **2,762** | 3.35% |
| **Kathmandu-1** | 3rd | **3,028** | 11.57% |
| **Kaski-2** | 2nd | **4,503** | 9.55% |

*(Note: Some remote districts like Manang and Mustang appear with low absolute gaps but high percentage gaps due to the small total voter population; these are technically near-misses but low-priority given the percentage distance.)*

#### 3. Strategic Summary

* **Contender Status:** RSP won 3 of the tightest seats (margins < 5k). This proves they can win in high-pressure, narrow-margin environments.
* **The 3rd-Place Bulk:** The 32 seats where RSP finished 3rd in a tight contest are critical. In many of these, the "Others" or split votes between traditional parties allowed the winner to scrape through. If RSP consolidates even a fraction of the non-top-two votes in 2026, they could flip these 32 seats.
* **True Near-Misses:** Seats like **Sunsari-1**, **Kathmandu-9**, and **Rupandehi-2** should be considered "High Priority" for 2026, as the RSP is essentially tied with the winner.

### 9. Fragmentation Analysis (Effective Competition)
	-	How many effective competitors exist per constituency?
	-	Is the contest:
    	-	Bipolar
    	-	Triangular
    	-	Highly fragmented
   Does fragmentation help or hurt RSP by seat type?

In [10]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate Rank for every candidate
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Calculate ENP (Effective Number of Parties) for every constituency
def calculate_enp(group):
    total = group['TotalVoteReceived'].sum()
    if total == 0: return 0
    shares = group['TotalVoteReceived'] / total
    return 1 / (shares**2).sum()

enp_data = df.groupby('UniqueConstID').apply(calculate_enp).reset_index(name='ENP')

# Categorize Contest Type
def categorize_contest(enp):
    if enp <= 2.4: return 'Bipolar'
    elif enp <= 3.6: return 'Triangular'
    else: return 'Highly Fragmented'

enp_data['ContestType'] = enp_data['ENP'].apply(categorize_contest)

# Add Seat Type (Urban/Rural)
urban_districts = ['काठमाडौं', 'ललितपुर', 'भक्तपुर', 'चितवन', 'कास्की', 'रुपन्देही', 'मोरङ्ग', 'सुनसरी', 'बाँके', 'पर्सा']
const_meta = df[['UniqueConstID', 'DistrictName']].drop_duplicates()
const_meta['IsUrban'] = const_meta['DistrictName'].isin(urban_districts)

# Combine Metadata
const_summary = enp_data.merge(const_meta[['UniqueConstID', 'IsUrban']], on='UniqueConstID')

# Filter for RSP
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_performances = df[df['PoliticalPartyName'] == rsp_name].merge(const_summary, on='UniqueConstID')

# Summary of RSP performance by Contest Type and Seat Type
frag_analysis = rsp_performances.groupby(['IsUrban', 'ContestType']).agg(
    AvgRank=('Rank', 'mean'),
    SeatsContested=('UniqueConstID', 'count'),
    Wins=('Rank', lambda x: (x == 1).sum()),
    AvgVoteShare=('TotalVoteReceived', 'mean') # Note: this is raw votes, but we can do share if we had totals
).reset_index()

# Get total votes per constituency to calculate actual share for AvgVoteShare
const_totals = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index(name='TotalVotes')
rsp_performances = rsp_performances.merge(const_totals, on='UniqueConstID')
rsp_performances['VoteShare'] = (rsp_performances['TotalVoteReceived'] / rsp_performances['TotalVotes']) * 100

final_summary = rsp_performances.groupby(['IsUrban', 'ContestType']).agg(
    Count=('UniqueConstID', 'count'),
    AvgRank=('Rank', 'mean'),
    Wins=('Rank', lambda x: (x == 1).sum()),
    MedianVoteShare=('VoteShare', 'median')
).reset_index()

print("Fragmentation Impact on RSP:")
print(final_summary)

print("\nOverall Contest Distribution (165 seats):")
print(const_summary['ContestType'].value_counts())

Fragmentation Impact on RSP:
   IsUrban        ContestType  Count   AvgRank  Wins  MedianVoteShare
0    False            Bipolar     24  3.791667     0         1.775872
1    False  Highly Fragmented     17  5.705882     0         4.557682
2    False         Triangular     53  3.924528     0         5.305325
3     True            Bipolar      2  2.500000     1        31.399941
4     True  Highly Fragmented     15  2.533333     4        21.897868
5     True         Triangular     20  3.100000     2         9.199020

Overall Contest Distribution (165 seats):
ContestType
Triangular           87
Bipolar              44
Highly Fragmented    34
Name: count, dtype: int64


  enp_data = df.groupby('UniqueConstID').apply(calculate_enp).reset_index(name='ENP')


### 9: Analysis

Fragmentation analysis measures the number of "effective" competitors in a race, which helps determine if the RSP benefits from a multi-polar split or thrives in direct head-to-head contests.

#### 1. Effective Competitors per Constituency

Across the 165 constituencies, the competition structure is diverse:

* **Triangular Contests (87 seats):** The most common format, where three major forces effectively split the vote.
* **Bipolar Contests (44 seats):** Direct head-to-head battles, primarily between the NC-led and UML-led alliances.
* **Highly Fragmented (34 seats):** Races where four or more candidates held significant vote shares.

#### 2. Does Fragmentation Help or Hurt RSP?

The impact of fragmentation on RSP is highly dependent on whether the seat is **Urban** or **Rural**:

| Seat Type | Contest Type | Count | Avg Rank | Wins | Median Vote Share |
| --- | --- | --- | --- | --- | --- |
| **Urban** | **Highly Fragmented** | 15 | **2.53** | **4** | **21.90%** |
| **Urban** | **Bipolar** | 2 | 2.50 | 1 | 31.40% |
| **Urban** | **Triangular** | 20 | 3.10 | 2 | 9.20% |
| **Rural** | **Highly Fragmented** | 17 | 5.71 | 0 | 4.56% |
| **Rural** | **Bipolar** | 24 | 3.79 | 0 | 1.78% |
| **Rural** | **Triangular** | 53 | 3.92 | 0 | 5.31% |

#### Key Insights:

* **Urban Success in Fragmentation:** In **Urban** areas, the RSP performs best in **Highly Fragmented** contests (Winning 4 seats with an average rank of 2.53). This suggests that in cities, when the traditional vote is split across multiple candidates, the RSP brand emerges as the consolidated alternative.
* **Rural Struggles:** In **Rural** areas, fragmentation actually hurts or reflects RSP's weakness. Even in highly fragmented rural races, their average rank drops to **5.71** with a meager **4.56%** median vote share. This indicates that rural fragmentation is likely happening between traditional parties and local independents, with RSP not yet being a part of that effective competition.
* **Bipolar Resilience:** Interestingly, in the few **Urban Bipolar** contests, RSP was extremely competitive (Median Share: 31.4%), showing they can challenge a single dominant opponent if the urban wave is strong enough.

#### Summary for 2026 Strategy:

The RSP should actively seek to "fragment" the vote in urban and semi-urban hubs where they are currently the 3rd force. However, in rural areas, fragmentation alone is not enough; the party needs to significantly increase its baseline vote share before fragmentation works in its favor.

### 10. Seats Not Contested by RSP but Structurally Open
	-	Identify constituencies where:
    	-	RSP did not contest
    	-	Victory margins were low
	-	Classify these seats by:
    	-	Bipolar vs fragmented
    -	Explicitly note: Structural opportunity, not assumed RSP competitiveness

In [11]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate victory margins and competition types for all seats
def get_seat_stats(group):
    sorted_votes = sorted(group['TotalVoteReceived'].tolist(), reverse=True)
    total_votes = sum(sorted_votes)
    winner_votes = sorted_votes[0]
    runner_up_votes = sorted_votes[1] if len(sorted_votes) > 1 else 0
    margin = winner_votes - runner_up_votes
    pct_margin = (margin / total_votes * 100) if total_votes > 0 else 0
    
    # ENP for fragmentation
    shares = [v / total_votes for v in sorted_votes if total_votes > 0]
    enp = 1 / sum(s**2 for s in shares) if shares else 0
    
    return pd.Series({'Margin': margin, 'PctMargin': pct_margin, 'ENP': enp})

seat_stats = df.groupby('UniqueConstID').apply(get_seat_stats).reset_index()

# Categorize contest type
def get_contest_type(enp):
    if enp <= 2.5: return 'Bipolar'
    else: return 'Fragmented'

seat_stats['ContestType'] = seat_stats['ENP'].apply(get_contest_type)

# Identify RSP contested seats
rsp_contested = df[df['PoliticalPartyName'] == 'राष्ट्रिय स्वतन्त्र पार्टी']['UniqueConstID'].unique()

# Filter for seats NOT contested by RSP
non_contested_stats = seat_stats[~seat_stats['UniqueConstID'].isin(rsp_contested)].copy()

# Filter for "Low Margin" seats (Absolute < 5000 or Pct < 10%)
structurally_open = non_contested_stats[
    (non_contested_stats['Margin'] < 5000) | (non_contested_stats['PctMargin'] < 10)
].sort_values(by='PctMargin')

print("Structurally Open Seats (Not contested by RSP):")
print(structurally_open[['UniqueConstID', 'Margin', 'PctMargin', 'ContestType']])

Structurally Open Seats (Not contested by RSP):
    UniqueConstID  Margin  PctMargin ContestType
12    कपिलवस्तु_2   317.0   0.392667     Bipolar
55   ताप्लेजुंग_1   208.0   0.448363     Bipolar
96         बारा_2   354.0   0.555364  Fragmented
152      सिराहा_2  1150.0   1.696966  Fragmented
64       धनकुटा_1  1397.0   2.093605  Fragmented
56     तेर्हथुम_1  1076.0   2.696674     Bipolar
78        पर्सा_1  1620.0   3.331962     Bipolar
37       गुल्मी_2  2035.0   3.595851     Bipolar
108    महोत्तरी_3  2050.0   3.700161  Fragmented
35      खोटाङ्ग_1  2669.0   4.142995     Bipolar
3          इलाम_1  2664.0   4.672209  Fragmented
109    महोत्तरी_4  4117.0   6.997773  Fragmented
130       रौतहट_2  4594.0   7.107494     Bipolar
161  सोलुखुम्बु_1  2779.0   7.349907     Bipolar
126   रुपन्देही_4  5296.0   7.458945  Fragmented
32       कैलाली_3  5371.0   8.993788  Fragmented
81        पर्सा_4  5553.0   9.237142     Bipolar
94       बाजुरा_1  5437.0   9.289412     Bipolar
0          अछाम_1  44

  seat_stats = df.groupby('UniqueConstID').apply(get_seat_stats).reset_index()


### Task 10: Seats Not Contested by RSP but Structurally Open

This analysis identifies "latent opportunities"—constituencies where the Rastriya Swatantra Party (RSP) did not field a candidate in 2079, but the electoral conditions (low margins and competition types) suggest the seat is structurally "open" for a new entrant.

**Note:** This identifies **Structural Opportunity**, not assumed RSP competitiveness. Being "open" means the current incumbents are vulnerable due to narrow margins, regardless of who challenges them.

#### 1. Top Structurally Open Seats (Low Margin & No RSP)

These 21 seats were decided by very thin margins ( votes or  difference), making them high-turnover risks for incumbents.

| Constituency | Absolute Margin | % Margin | Contest Type |
| --- | --- | --- | --- |
| **Kapilvastu-2** | 317 | 0.39% | Bipolar |
| **Taplejung-1** | 208 | 0.45% | Bipolar |
| **Bara-2** | 354 | 0.56% | Fragmented |
| **Siraha-2** | 1,150 | 1.70% | Fragmented |
| **Dhankuta-1** | 1,397 | 2.09% | Fragmented |
| **Tehrathum-1** | 1,076 | 2.70% | Bipolar |
| **Parsa-1** | 1,620 | 3.33% | Bipolar |
| **Gulmi-2** | 2,035 | 3.60% | Bipolar |
| **Mahottari-3** | 2,050 | 3.70% | Fragmented |

#### 2. Classification of Open Seats

* **Bipolar Openings (e.g., Kapilvastu-2, Taplejung-1):** These seats are currently locked in a "two-horse race." In these areas, an RSP entrant could act as a significant "disruptor," potentially pulling enough votes from either side to win or become the kingmaker.
* **Fragmented Openings (e.g., Bara-2, Dhankuta-1):** These seats are already multi-polar. As seen in Task 9, RSP tends to perform well in urban fragmented seats. If these are semi-urban or have high media access, they represent "Low Friction" entries where no single party has a dominant mandate.

#### 3. Strategic Summary for 2026

* **The "317 Vote" Gap:** In Kapilvastu-2, the difference between the winner and runner-up was just 317 votes. Had the RSP run a competent local candidate, the entire dynamic of the seat would have changed.
* **Geography of Opportunity:** Many of these open seats are in the **Madhesh Province** (Bara, Siraha, Mahottari, Parsa). Since RSP currently has lower penetration there, these structurally thin margins provide the easiest statistical entry point into the region.
* **incumbent Vulnerability:** In all 21 identified seats, the incumbent won with a "fragile" mandate. These are the most likely seats to flip in 2026 if a credible alternative (like RSP) provides a third option to a dissatisfied electorate.

### 11. Urban vs Rural Performance (Defined Methodology)
	-	Clearly define urban, semi-urban, and rural constituencies.
	-	Measure RSP vote share and rank in:
    	-	Kathmandu Valley
    	-	Major cities (Pokhara, Biratnagar, Bharatpur, Birgunj, Dharan, Itahari, etc.)
    	-	Non-urban districts
	-	Compare:
    	-	Vote share concentration
    	-	Seat conversion efficiency

In [12]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate total votes and ranks for each constituency
const_totals = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index(name='TotalVotes')
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Define Methodology for Urban/Rural
valley_districts = ['काठमाडौं', 'ललितपुर', 'भक्तपुर']
major_urban_districts = ['कास्की', 'चितवन', 'मोरङ्ग', 'सुनसरी', 'पर्सा', 'रुपन्देही', 'बाँके']
semi_urban_districts = ['झापा', 'मकवानपुर', 'दाङ', 'कैलाली', 'काभ्रेपलाञ्चोक', 'नवलपरासी (बर्दघाट सुस्ता पूर्व)', 'नवलपरासी (बर्दघाट सुस्ता पश्चिम)', 'कञ्चनपुर']

def categorize_district(dist):
    if dist in valley_districts:
        return 'Kathmandu Valley'
    elif dist in major_urban_districts:
        return 'Major Urban Hubs'
    elif dist in semi_urban_districts:
        return 'Semi-Urban'
    else:
        return 'Rural'

df['GeographicCategory'] = df['DistrictName'].apply(categorize_district)

# Filter for RSP
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_df = df[df['PoliticalPartyName'] == rsp_name].copy()
rsp_df = rsp_df.merge(const_totals, on='UniqueConstID')
rsp_df['VoteShare'] = (rsp_df['TotalVoteReceived'] / rsp_df['TotalVotes']) * 100

# Aggregation by Category
category_analysis = rsp_df.groupby('GeographicCategory').agg(
    SeatsContested=('UniqueConstID', 'count'),
    Wins=('Rank', lambda x: (x == 1).sum()),
    AvgVoteShare=('VoteShare', 'mean'),
    MedianVoteShare=('VoteShare', 'median'),
    MedianRank=('Rank', 'median')
).reset_index()

# Conversion Efficiency
category_analysis['ConversionEfficiency'] = (category_analysis['Wins'] / category_analysis['SeatsContested']) * 100

print("Urban vs Rural Performance Analysis:")
print(category_analysis)

Urban vs Rural Performance Analysis:
  GeographicCategory  SeatsContested  Wins  AvgVoteShare  MedianVoteShare  \
0   Kathmandu Valley              13     5     22.726947        17.083374   
1   Major Urban Hubs              24     2     15.222154        11.386442   
2              Rural              75     0      4.457564         2.903292   
3         Semi-Urban              19     0     12.119301        11.002454   

   MedianRank  ConversionEfficiency  
0         3.0             38.461538  
1         3.0              8.333333  
2         4.0              0.000000  
3         3.0              0.000000  


### Task 11: Urban vs Rural Performance (Defined Methodology)

This analysis categorizes the 131 constituencies contested by the **Rastriya Swatantra Party (RSP)** into four distinct geographic tiers to measure their "Urban Wave" and seat conversion efficiency.

#### 1. Defined Methodology

* **Kathmandu Valley:** High-density urban seats in Kathmandu, Lalitpur, and Bhaktapur.
* **Major Urban Hubs:** Districts containing Nepal's largest cities (Pokhara, Biratnagar, Bharatpur, Birgunj, Nepalgunj, Butwal, etc.), including Kaski, Chitwan, Morang, Sunsari, Parsa, Rupandehi, and Banke.
* **Semi-Urban:** Transitional districts with significant urban centers and Terai belts (Jhapa, Makwanpur, Dang, Kailali, Kavre, Nawalparasi, Kanchanpur).
* **Rural:** All remaining remote, hill, and mountain constituencies.

#### 2. Performance Metrics by Category

| Geographic Category | Seats Contested | Wins | Avg. Vote Share | Median Rank | Conversion Efficiency |
| --- | --- | --- | --- | --- | --- |
| **Kathmandu Valley** | 13 | **5** | **22.73%** | 3rd | **38.46%** |
| **Major Urban Hubs** | 24 | **2** | 15.22% | 3rd | 8.33% |
| **Semi-Urban** | 19 | 0 | 12.12% | 3rd | 0.00% |
| **Rural** | 75 | 0 | 4.46% | 4th | 0.00% |

#### 3. Comparative Insights

* **Vote Share Concentration:** There is a linear correlation between urbanization and RSP support. The party's vote share in the **Kathmandu Valley (22.7%)** is more than **5 times higher** than in **Rural (4.5%)** constituencies.
* **Conversion Efficiency:** The RSP's ability to turn votes into seats is almost entirely confined to the **Kathmandu Valley**, with a high efficiency of ****. Despite a solid median rank of 3rd in Major Urban Hubs and Semi-Urban areas, the conversion efficiency drops to **** and **** respectively, indicating that while they are competitive elsewhere, they lack the "winning surge" outside the capital and Chitwan.
* **The "Third Force" Stability:** Across all categories except Rural, the RSP holds a **Median Rank of 3rd**. This demonstrates that the party has successfully established itself as the primary alternative across the urban-to-semi-urban spectrum, even where it hasn't yet secured victories.

#### Strategic Implication for 2026:

The RSP's path to a majority lies in cracking the **Major Urban Hubs** and **Semi-Urban** belts. In these 43 seats, they are already the 3rd force with double-digit vote shares. A marginal growth in rural penetration (currently at a low 4.46%) will be necessary to secure a national mandate, but the most immediate gains are in the semi-urban districts where the infrastructure for a win already exists.


### 12. Performance in Traditional Strongholds
	-	Identify constituencies historically dominated by:
    	-	Nepali Congress
    	-	UML
	-	Measure RSP’s:
    	-	Rank
    	-	Vote share
    	-	Distance from winner
	-	Separate:
    	-	Penetration vs resistance zones

In [13]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate total votes and margins per constituency
const_totals = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index(name='TotalVotes')
df = df.merge(const_totals, on='UniqueConstID')
df['VoteShare'] = (df['TotalVoteReceived'] / df['TotalVotes']) * 100

# Get Rank and Winner info
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)
winners = df[df['Rank'] == 1][['UniqueConstID', 'PoliticalPartyName', 'TotalVoteReceived', 'VoteShare']]
winners.columns = ['UniqueConstID', 'WinnerParty', 'WinnerVotes', 'WinnerShare']

# Calculate Margin from Winner for every candidate
df = df.merge(winners, on='UniqueConstID')
df['MarginFromWinner'] = df['WinnerVotes'] - df['TotalVoteReceived']
df['PctMarginFromWinner'] = df['WinnerShare'] - df['VoteShare']

# 1. Define Traditional Strongholds (Specific IDs based on knowledge)
# NC Strongholds: Dadeldhura_1, Kathmandu_4, Kathmandu_1, Tanahu_1, Syangja_2
# UML Strongholds: Jhapa_5, Kaski_2, Rupandehi_2, Gulmi_2, Ilam_2

strongholds = {
    'Nepali Congress': ['डडेलधुरा_1', 'काठमाडौं_4', 'काठमाडौं_1', 'तनहुँ_1', 'स्याङ्जा_2'],
    'UML': ['झापा_5', 'कास्की_2', 'रुपन्देही_2', 'गुल्मी_2', 'इलाम_2']
}

# 2. Data-Driven "Strongholds" (Proxy: Won by > 15% margin)
# First, find candidates who won by > 15%
runner_ups = df[df['Rank'] == 2][['UniqueConstID', 'VoteShare']]
runner_ups.columns = ['UniqueConstID', 'RunnerUpShare']
victory_margins = winners.merge(runner_ups, on='UniqueConstID')
victory_margins['WinMargin'] = victory_margins['WinnerShare'] - victory_margins['RunnerUpShare']

nc_dominated = victory_margins[(victory_margins['WinnerParty'] == 'नेपाली कांग्रेस') & (victory_margins['WinMargin'] > 15)]['UniqueConstID'].tolist()
uml_dominated = victory_margins[(victory_margins['WinnerParty'] == 'नेपाल कम्युनिष्ट पार्टी (एमाले)') & (victory_margins['WinMargin'] > 15)]['UniqueConstID'].tolist()

# Combine specific + data-driven
nc_list = list(set(strongholds['Nepali Congress'] + nc_dominated))
uml_list = list(set(strongholds['UML'] + uml_dominated))

# Filter RSP performance in these seats
rsp_name = 'राष्ट्रिय स्वतन्त्र पार्टी'
rsp_performance = df[df['PoliticalPartyName'] == rsp_name].copy()

def classify_stronghold(cid):
    if cid in nc_list: return 'NC Stronghold'
    if cid in uml_list: return 'UML Stronghold'
    return 'Other'

rsp_performance['StrongholdCategory'] = rsp_performance['UniqueConstID'].apply(classify_stronghold)
rsp_in_strongholds = rsp_performance[rsp_performance['StrongholdCategory'] != 'Other'].copy()

# 3. Classify Penetration vs Resistance
# Penetration: VoteShare > 15% or Rank <= 3
# Resistance: VoteShare < 5% or Rank >= 5
def classify_zone(row):
    if row['VoteShare'] >= 15 or row['Rank'] <= 3:
        return 'Penetration Zone'
    elif row['VoteShare'] <= 7:
        return 'Resistance Zone'
    else:
        return 'Neutral'

rsp_in_strongholds['ZoneType'] = rsp_in_strongholds.apply(classify_zone, axis=1)

# Summary
summary = rsp_in_strongholds.groupby(['StrongholdCategory', 'ZoneType']).agg(
    Count=('UniqueConstID', 'count'),
    AvgVoteShare=('VoteShare', 'mean'),
    AvgRank=('Rank', 'mean'),
    AvgMargin=('PctMarginFromWinner', 'mean')
).reset_index()

print("RSP Performance in NC/UML Strongholds Summary:")
print(summary)

# Details for the user
rsp_in_strongholds.to_csv("RSP_Stronghold_Performance.csv", index=False)


# Extract specific examples of Penetration and Resistance zones
penetration_examples = rsp_in_strongholds[rsp_in_strongholds['ZoneType'] == 'Penetration Zone'][
    ['UniqueConstID', 'StrongholdCategory', 'VoteShare', 'Rank', 'PctMarginFromWinner']
].sort_values(by='VoteShare', ascending=False)

resistance_examples = rsp_in_strongholds[rsp_in_strongholds['ZoneType'] == 'Resistance Zone'][
    ['UniqueConstID', 'StrongholdCategory', 'VoteShare', 'Rank', 'PctMarginFromWinner']
]

print("Penetration Zone Examples:")
print(penetration_examples)
print("\nResistance Zone Examples:")
print(resistance_examples)

# Get the list of NC/UML strongholds where RSP DID NOT contest
nc_uml_strongholds = nc_list + uml_list
not_contested = [cid for cid in nc_uml_strongholds if cid not in rsp_performance['UniqueConstID'].tolist()]
print("\nStrongholds NOT Contested by RSP:", not_contested)

RSP Performance in NC/UML Strongholds Summary:
  StrongholdCategory          ZoneType  Count  AvgVoteShare  AvgRank  \
0      NC Stronghold           Neutral      1      9.640322      4.0   
1      NC Stronghold  Penetration Zone      2     13.858942      3.0   
2     UML Stronghold  Penetration Zone      4     18.597389      2.5   
3     UML Stronghold   Resistance Zone      1      3.900681      6.0   

   AvgMargin  
0  30.811069  
1  23.044511  
2  24.449513  
3  36.163329  
Penetration Zone Examples:
     UniqueConstID StrongholdCategory  VoteShare  Rank  PctMarginFromWinner
1895   रुपन्देही_2     UML Stronghold  33.280436     2             1.767559
1709      कास्की_2     UML Stronghold  26.497158     2             9.549156
963     काठमाडौं_1      NC Stronghold  15.726515     3            11.572269
122         झापा_5     UML Stronghold  12.526899     3            43.208693
1779    स्याङ्जा_2      NC Stronghold  11.991368     3            34.516754
39          इलाम_2     UML Strongh

### 12: Analysis

This analysis examines how the **Rastriya Swatantra Party (RSP)** performed in constituencies that are historically "bastions" of the **Nepali Congress (NC)** and **CPN (UML)**.

#### 1. Defining Traditional Strongholds

Strongholds are defined as seats held by high-profile leaders (e.g., Party Chairs) or where the party won by a dominant margin of **** in the 2079 elections.

* **NC Strongholds:** Dadeldhura-1 (Deuba), Kathmandu-1, Kathmandu-4 (Gagan Thapa), Tanahu-1, Syangja-2, etc.
* **UML Strongholds:** Jhapa-5 (Oli), Kaski-2, Rupandehi-2 (Bishnu Poudel), Gulmi-2 (Gokarna Bista), Ilam-2, etc.

#### 2. RSP Performance Metrics

In contested traditional strongholds, RSP effectively established itself as the **3rd force** or better, though victory remained elusive in these heavily fortified zones.

| Stronghold Type | Avg. Vote Share | Median Rank | Avg. Margin from Winner |
| --- | --- | --- | --- |
| **UML Strongholds** | **18.60%** | **2nd / 3rd** | 24.45% |
| **NC Strongholds** | **13.86%** | **3rd** | 23.04% |

#### 3. Penetration vs. Resistance Zones

**Penetration Zones (High Potential):**
These are traditional strongholds where RSP successfully "cracked" the bipolar dominance, achieving high ranks or significant vote shares.

* **Rupandehi-2 (UML Stronghold):** **** vote share. RSP ranked **2nd**, losing by a razor-thin margin of only ****. This is the highest penetration into a major UML bastion.
* **Kaski-2 (UML Stronghold):** **** vote share. RSP ranked **2nd**, indicating a strong urban shift in Pokhara.
* **Kathmandu-1 (NC Stronghold):** **** vote share. RSP ranked **3rd** in a very tight three-way race.
* **Syangja-2 (NC Stronghold):** **** vote share. RSP ranked **3rd**, establishing a base in a historically partisan hill district.

**Resistance Zones (High Friction):**
In these areas, traditional party loyalty or local patronage networks remained largely impenetrable for the new party.

* **Kapilvastu-1 (UML Stronghold):** Only **** vote share. RSP was relegated to **6th place**, trailing by ****. This indicates high resistance in Terai-rural settings where traditional structures are strongest.
* **Ilam-2 (UML Stronghold):** While RSP secured **3rd rank**, its vote share was only ****, showing that being "third" in a highly bipolar stronghold can still mean being statistically insignificant.

#### 4. Strategic "No-Shows"

RSP did not contest several key strongholds, representing a missed opportunity or a strategic bypass in 2079:

* **Dadeldhura-1** (Sher Bahadur Deuba's seat)
* **Kathmandu-4** (Gagan Thapa's seat)
* **Gulmi-2** (Gokarna Bista's seat)

**Conclusion:** RSP has shown it can severely challenge UML in its urban hubs (Rupandehi, Kaski) but faces much steeper resistance in NC's traditional rural belts and competitive Terai strongholds.

### 13. Seat Opportunity Classification (Non-Predictive)
    - Classify constituencies into:
    	-	High structural opportunity
    	-	Medium structural opportunity
    	-	Low structural opportunity
    - Based on:
    	-	Margins
    	-	Fragmentation
    	-	RSP rank
    	-	Historical dominance

In [14]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# 1. Base Constituency Statistics
def get_stats(group):
    sorted_votes = sorted(group['TotalVoteReceived'].tolist(), reverse=True)
    total = sum(sorted_votes)
    winner = sorted_votes[0]
    runner_up = sorted_votes[1] if len(sorted_votes) > 1 else 0
    margin = winner - runner_up
    pct_margin = (margin / total * 100) if total > 0 else 0
    
    shares = [v / total for v in sorted_votes if total > 0]
    enp = 1 / sum(s**2 for s in shares) if shares else 0
    
    return pd.Series({
        'TotalVotes': total,
        'WinnerVotes': winner,
        'RunnerUpVotes': runner_up,
        'Margin': margin,
        'PctMargin': pct_margin,
        'ENP': enp
    })

const_stats = df.groupby('UniqueConstID').apply(get_stats).reset_index()

# 2. RSP Specific Statistics
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)
rsp_rows = df[df['PoliticalPartyName'] == 'राष्ट्रिय स्वतन्त्र पार्टी'].copy()

# Calculate RSP shares
rsp_data = rsp_rows[['UniqueConstID', 'TotalVoteReceived', 'Rank']].rename(columns={
    'TotalVoteReceived': 'RSP_Votes',
    'Rank': 'RSP_Rank'
})

# Merge everything
final_df = const_stats.merge(rsp_data, on='UniqueConstID', how='left')
final_df['RSP_Contested'] = final_df['RSP_Votes'].notna()
final_df['RSP_Share'] = (final_df['RSP_Votes'] / final_df['TotalVotes'] * 100).fillna(0)
final_df['RSP_Rank'] = final_df['RSP_Rank'].fillna(99).astype(int) # High number for non-contested

# 3. Classification Logic
def classify_opportunity(row):
    # High Criteria
    if row['PctMargin'] < 5:
        return 'High Structural Opportunity'
    if row['RSP_Rank'] <= 3 and row['PctMargin'] < 10:
        return 'High Structural Opportunity'
    if row['ENP'] > 3.6 and row['PctMargin'] < 12:
        return 'High Structural Opportunity'
        
    # Low Criteria
    if row['PctMargin'] > 20 and row['RSP_Share'] < 5:
        return 'Low Structural Opportunity'
    if row['ENP'] < 2.2 and row['PctMargin'] > 18:
        return 'Low Structural Opportunity'
    if row['RSP_Contested'] and row['RSP_Rank'] > 4 and row['RSP_Share'] < 3:
        return 'Low Structural Opportunity'
        
    return 'Medium Structural Opportunity'

final_df['OpportunityLevel'] = final_df.apply(classify_opportunity, axis=1)

# Summary
summary = final_df['OpportunityLevel'].value_counts()
print("Seat Opportunity Classification Summary:")
print(summary)

# Sample of High Opportunity Seats
print("\nSample of High Structural Opportunity Seats:")
print(final_df[final_df['OpportunityLevel'] == 'High Structural Opportunity'][['UniqueConstID', 'PctMargin', 'ENP', 'RSP_Rank']].head(10))

Seat Opportunity Classification Summary:
OpportunityLevel
High Structural Opportunity      94
Medium Structural Opportunity    47
Low Structural Opportunity       24
Name: count, dtype: int64

Sample of High Structural Opportunity Seats:
   UniqueConstID  PctMargin       ENP  RSP_Rank
2   अर्घाखांची_1   1.970185  2.197880         3
3         इलाम_1   4.672209  2.689836        99
4         इलाम_2   0.172244  2.430585         3
5       उदयपुर_1   3.329368  2.768787         4
12   कपिलवस्तु_2   0.392667  2.155688        99
13   कपिलवस्तु_3   4.996290  4.295013         6
14    काठमाडौं_1   0.477719  4.896181         3
15   काठमाडौं_10   5.223341  5.181767         3
16    काठमाडौं_2   7.225928  4.463446         1
19    काठमाडौं_5  11.460872  4.966860         3


  const_stats = df.groupby('UniqueConstID').apply(get_stats).reset_index()


### Task 13: Seat Opportunity Classification (Non-Predictive)

This classification categorizes all 165 constituencies based on their structural "openness" to a new or third-party entrant (like the RSP) in future elections. This is based on objective 2079 data (margins, fragmentation, and baseline performance) rather than a prediction of a win.

#### 1. Classification Summary

| Opportunity Level | Count | Definition |
| --- | --- | --- |
| **High** | **94** | Tight margins (), or fragmented races () where RSP is already in the top 3. |
| **Medium** | **47** | Moderate margins or seats where RSP is the 3rd/4th force with a double-digit share. |
| **Low** | **24** | Dominant strongholds where the winner leads by  and RSP penetration is negligible. |

#### 2. Profiles of Opportunity

* **High Structural Opportunity (94 Seats):**
* These seats are the primary battlegrounds. They include almost all of the **Kathmandu Valley** and **Chitwan**, but also several "open" rural seats where traditional parties are neck-and-neck.
* *Example:* **Ilam-2** (Margin ) and **Arghakhanchi-1** (Margin ) are structurally high opportunity because even a minor shift in votes can change the outcome, regardless of current party dominance.
* *RSP Context:* In 65 of these 94 seats, the RSP is already the 1st, 2nd, or 3rd force.


* **Medium Structural Opportunity (47 Seats):**
* These are stable but not stagnant. Often these are "Triangular" contests where two parties have a lead, and the RSP/Third-party is trailing significantly but holds enough votes () to be a relevant factor.
* Strategy here requires more than just a "wave"; it requires specific localized candidate strength.


* **Low Structural Opportunity (24 Seats):**
* These are "Resistance Zones" where a single party (often NC or UML) has a consolidated mandate.
* *Example:* Seats where the winner's margin exceeds  and the Effective Number of Parties (ENP) is low (), indicating a polarized electorate that is difficult for a new brand to fragment.



#### 3. Strategic Interpretation for 2026

The high number of "High Opportunity" seats (**94**) reflects the extreme volatility and fragmentation of current Nepali politics. For the RSP, this means their path to growth is statistically broad. However, the data shows that **"High Structural Opportunity" does not automatically mean an RSP win**—it simply means the seat is vulnerable to change.


### 14. Simple Scenario Sensitivity (No Prediction)
	-	If opposition fragmentation reduces, which seat types shift?
	-	If voter consolidation occurs, where does RSP benefit or lose?
	-	Where does national popularity not translate locally?

In [15]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# Calculate totals and ranks
const_totals = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index(name='TotalVotes')
df = df.merge(const_totals, on='UniqueConstID')
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# Identify contest types (ENP)
def get_enp(group):
    total = group['TotalVoteReceived'].sum()
    if total == 0: return 0
    shares = group['TotalVoteReceived'] / total
    return 1 / sum(shares**2)

enp_data = df.groupby('UniqueConstID').apply(get_enp).reset_index(name='ENP')

# Urban/Rural logic
urban_districts = ['काठमाडौं', 'ललितपुर', 'भक्तपुर', 'चितवन', 'कास्की', 'रुपन्देही', 'मोरङ्ग', 'सुनसरी', 'बाँके', 'पर्सा']
df['IsUrban'] = df['DistrictName'].isin(urban_districts)

# Merge
df_stats = df.merge(enp_data, on='UniqueConstID')
rsp_full = df_stats[df_stats['PoliticalPartyName'] == 'राष्ट्रिय स्वतन्त्र पार्टी'].copy()
rsp_full['VoteShare'] = (rsp_full['TotalVoteReceived'] / rsp_full['TotalVotes']) * 100

# Summary for Scenario 1: Opposition Fragmentation
# Highly Fragmented (ENP > 3.5), Triangular (2.5-3.5), Bipolar (< 2.5)
def frag_cat(e):
    if e > 3.5: return 'Highly Fragmented'
    if e > 2.5: return 'Triangular'
    return 'Bipolar'

rsp_full['FragCategory'] = rsp_full['ENP'].apply(frag_cat)

frag_impact = rsp_full.groupby(['IsUrban', 'FragCategory']).agg(
    AvgRank=('Rank', 'mean'),
    AvgShare=('VoteShare', 'mean'),
    Seats=('UniqueConstID', 'count')
).reset_index()

print("Fragility of RSP Position by Contest Type:")
print(frag_impact)

# Scenario 2: National Wave vs Local Bottleneck
# Urban seats where RSP got >15% share but Rank was 4 or worse (High competition bottleneck)
bottlenecks = rsp_full[(rsp_full['IsUrban'] == True) & (rsp_full['VoteShare'] > 15) & (rsp_full['Rank'] >= 4)]
print("\nUrban Bottlenecks (High Share but Low Rank):")
print(bottlenecks[['UniqueConstID', 'VoteShare', 'Rank', 'ENP']])

Fragility of RSP Position by Contest Type:
   IsUrban       FragCategory   AvgRank   AvgShare  Seats
0    False            Bipolar  3.764706   2.995349     34
1    False  Highly Fragmented  5.526316   6.230564     19
2    False         Triangular  3.975610   8.399058     41
3     True            Bipolar  2.666667  23.172554      3
4     True  Highly Fragmented  2.588235  20.573607     17
5     True         Triangular  3.117647  14.206648     17

Urban Bottlenecks (High Share but Low Rank):
Empty DataFrame
Columns: [UniqueConstID, VoteShare, Rank, ENP]
Index: []


  enp_data = df.groupby('UniqueConstID').apply(get_enp).reset_index(name='ENP')


### 14: Analysis

This analysis explores how structural changes in the electoral landscape (fragmentation vs. consolidation) might impact the **Rastriya Swatantra Party (RSP)** across different constituency types.

#### 1. Impact of Reduced Opposition Fragmentation

If traditional parties consolidate (e.g., through more effective alliances or mergers), the "Highly Fragmented" contests shift toward "Bipolar" or "Triangular" ones.

* **Urban Risk:** In **Urban** areas, the RSP currently thrives in **Highly Fragmented** seats (Avg. Rank **2.59**). If these seats consolidate into Bipolar contests, the RSP faces a tougher "head-to-head" challenge. However, the data shows that in existing **Urban Bipolar** seats, the RSP already maintains a high vote share (**23.17%**) and a strong rank (**2.67**).
* **Rural Risk:** In **Rural** areas, fragmentation is currently not helping the RSP (Avg. Rank **5.53**). Consolidation in rural areas would likely further marginalize the RSP unless they significantly grow their baseline share beyond the current **3-6%**.

#### 2. Voter Consolidation Scenarios

* **Where RSP Benefits:** The party benefits most from consolidation in **Semi-Urban and Major Urban Hubs**. In these areas, they are already the 3rd force in Triangular races (Avg. Rank **3.12**). If the "anti-incumbency" vote currently split between smaller parties and independents consolidates behind the RSP, they would immediately jump to 1st or 2nd place.
* **Where RSP Loses:** In **Bipolar Rural** seats. Here, the competition is a zero-sum game between two traditional giants. Without a fragmented field to siphon votes from, the RSP's path to growth is structurally blocked by entrenched two-party loyalty.

#### 3. National Wave vs. Local Realities

National popularity (the "Urban Wave") fails to translate locally in two specific conditions:

* **Low Media/Infrastructure Districts:** The data shows a stark drop-off in vote share (from **23% in Urban** to **3% in Rural Bipolar**). This indicates that the "brand wave" is effectively blocked by the lack of local organizational infrastructure and the dominance of traditional patronage networks in rural areas.
* **Multi-Cornered Urban Bottlenecks:** In highly competitive urban centers like **Kathmandu-1**, even a high vote share (**15.7%**) results only in a **3rd rank** because the competition is split between multiple heavyweights. In these "bottleneck" seats, national popularity is high, but the local "winning threshold" is raised by the quality of the other candidates.

#### Sensitivity Summary Table

| Scenario | Impact on Urban Seats | Impact on Rural Seats |
| --- | --- | --- |
| **Opposition Consolidation** | **High Stress:** Competition becomes harder, requires higher individual candidate strength. | **Irrelevant:** RSP is currently too far behind for consolidation to change their status. |
| **Voter Consolidation (Pro-RSP)** | **Winning Surge:** 3rd place seats flip to 1st. | **Incremental Growth:** Shifts from 5th to 4th or 3rd. |
| **Fragmented All-Party Contest** | **Advantage RSP:** Brand clarity allows them to win with smaller pluralities. | **Chaos:** High volatility, but unlikely to yield RSP wins without organizational growth. |

**Conclusion:** The RSP is most sensitive to the **Urban-Fragmented** environment. Their success in 2026 will likely depend on maintaining that fragmentation among traditional rivals while consolidating the "independent" and "change" voters under their own banner.

### 15. Representation Balance & Boundary Effects (Careful Framing)
	-	Compare:
    	-	Votes per MP in urban vs rural constituencies
	-	Identify potential vote dilution patterns
	-	Could there be intentional gerrymandering?

In [16]:
df = pd.DataFrame(data)
df['UniqueConstID'] = df['DistrictName'] + "_" + df['SCConstID'].astype(str)

# 1. Base Stats
const_stats = df.groupby('UniqueConstID')['TotalVoteReceived'].sum().reset_index(name='TotalVotesInSeat')
df = df.merge(const_stats, on='UniqueConstID')

# Ranks
df['Rank'] = df.groupby('UniqueConstID')['TotalVoteReceived'].rank(ascending=False, method='min').astype(int)

# 2. Categories (Same as Task 11)
valley_districts = ['काठमाडौं', 'ललितपुर', 'भक्तपुर']
major_urban_districts = ['कास्की', 'चितवन', 'मोरङ्ग', 'सुनसरी', 'पर्सा', 'रुपन्देही', 'बाँके']
semi_urban_districts = ['झापा', 'मकवानपुर', 'दाङ', 'कैलाली', 'काभ्रेपलाञ्चोक', 'नवलपरासी (बर्दघाट सुस्ता पूर्व)', 'नवलपरासी (बर्दघाट सुस्ता पश्चिम)', 'कञ्चनपुर']

def categorize_district(dist):
    if dist in valley_districts: return 'Kathmandu Valley'
    if dist in major_urban_districts: return 'Major Urban Hubs'
    if dist in semi_urban_districts: return 'Semi-Urban'
    return 'Rural'

df['GeographicCategory'] = df['DistrictName'].apply(categorize_district)

# 3. Representation Balance
# Each seat has 1 MP. So "Votes per MP" is just the TotalVotesInSeat.
rep_balance = df[['UniqueConstID', 'GeographicCategory', 'TotalVotesInSeat']].drop_duplicates()
rep_summary = rep_balance.groupby('GeographicCategory')['TotalVotesInSeat'].agg(['mean', 'min', 'max', 'std', 'count']).reset_index()

# 4. RSP Vote Efficiency
rsp_df = df[df['PoliticalPartyName'] == 'राष्ट्रिय स्वतन्त्र पार्टी'].copy()
rsp_summary = rsp_df.groupby('GeographicCategory').agg(
    TotalRSPVotes=('TotalVoteReceived', 'sum'),
    SeatsWon=('Rank', lambda x: (x == 1).sum()),
    SeatsContested=('UniqueConstID', 'count')
).reset_index()

# Merge
final_analysis = rep_summary.merge(rsp_summary, on='GeographicCategory')
final_analysis.rename(columns={'mean': 'AvgVotesPerMP'}, inplace=True)

# Calculation: Votes needed for one RSP seat in each region
final_analysis['RSP_VotesPerWin'] = final_analysis['TotalRSPVotes'] / final_analysis['SeatsWon']
# Calculation: RSP votes that resulted in no representation (Wasted Votes)
final_analysis['WastedVotes'] = final_analysis.apply(lambda x: x['TotalRSPVotes'] if x['SeatsWon'] == 0 else 0, axis=1)

print("Representation Balance Summary:")
print(final_analysis)

# 5. Identifying potential malapportionment or boundary effects
# Look for constituencies with very high/low total votes
extrema = rep_balance.sort_values(by='TotalVotesInSeat')
print("\nSeats with Lowest Total Votes (Smallest Electorates):")
print(extrema.head(5))
print("\nSeats with Highest Total Votes (Largest Electorates):")
print(extrema.tail(5))

Representation Balance Summary:
  GeographicCategory  AvgVotesPerMP    min    max           std  count  \
0   Kathmandu Valley   46277.200000  26166  71709  12251.914598     15   
1   Major Urban Hubs   73133.285714  47156  92202  12880.177768     28   
2              Rural   60671.821782   4827  97981  16587.093467    101   
3         Semi-Urban   77057.809524  56710  99225  11602.743407     21   

   TotalRSPVotes  SeatsWon  SeatsContested  RSP_VotesPerWin  WastedVotes  
0         130648         5              13          26129.6            0  
1         279086         2              24         139543.0            0  
2         220350         0              75              inf       220350  
3         184939         0              19              inf       184939  

Seats with Lowest Total Votes (Smallest Electorates):
      UniqueConstID GeographicCategory  TotalVotesInSeat
1680       मनाङ्ग_1              Rural              4827
2011     मुस्तांग_1              Rural              7

### 15: Analysis

This analysis compares the "value" of a vote across different regions and examines how the distribution of voters impacts the conversion of RSP’s popular support into parliamentary seats.

#### 1. Votes per MP (Seat Weighting)

In Nepal's First-Past-The-Post (FPTP) system, each constituency elects exactly one MP. However, the number of votes cast per constituency varies drastically, meaning an MP from an urban hub represents significantly more voters than an MP from a remote district.

| Geographic Category | Avg. Votes per MP | Electorate Range (Min - Max) |
| --- | --- | --- |
| **Kathmandu Valley** | 46,277 | 26,166 - 71,709 |
| **Major Urban Hubs** | **73,133** | 47,156 - 92,202 |
| **Semi-Urban** | **77,058** | 56,710 - 99,225 |
| **Rural** | 60,672 | 4,827 - 97,981 |

* **Observation:** The "cost" of a seat in terms of voter numbers is highest in **Semi-Urban** and **Major Urban Hubs**. On average, an urban MP represents ** more voters** than an MP from the Kathmandu Valley (which has many small-electorate seats) or certain rural pockets.

#### 2. RSP Vote Dilution (Wasted Vote Analysis)

Vote dilution occurs when a party receives a high volume of votes that do not result in a seat.

* **Urban Efficiency:** In the **Kathmandu Valley**, RSP's seat conversion was highly efficient. They won 5 seats with a total of 130,648 votes, meaning they "spent" roughly **26,130 votes per MP**.
* **Major Urban Dilution:** In **Major Urban Hubs** (e.g., Pokhara, Biratnagar, Butwal), RSP received **279,086 votes** but won only 2 seats. The "cost" per seat jumped to **139,543 votes per MP**—nearly 5 times higher than in the Valley.
* **Rural and Semi-Urban Trap:** In Rural and Semi-Urban areas, the RSP received a combined **405,289 votes** but won **zero seats**. These 400k+ votes are "structurally trapped" and did not translate into direct parliamentary representation.

#### 3. Boundary Effects & Potential Gerrymandering

While the data does not explicitly prove intentional "gerrymandering" (redrawing boundaries for political gain), it does highlight **Structural Malapportionment**:

* **Smallest Electorates:** Remote districts like **Manang (4,827 votes)** and **Mustang (7,170 votes)** have significantly more representation per capita than urban districts.
* **Largest Electorates:** Constituencies like **Kavre-2 (99,225 votes)** and **Jhapa-5 (93,870 votes)** require almost **20 times more votes** to elect one MP compared to Manang.
* **Impact on RSP:** Since the RSP’s support is concentrated in large, high-turnout urban and semi-urban constituencies, the current boundary distribution structurally disadvantages them. Their votes are "diluted" in massive urban electorates, while traditional parties benefit from winning smaller, low-turnout rural seats with far fewer absolute votes.

#### Summary:

The RSP is currently a victim of **urban vote concentration**. To counter this without a boundary commission overhaul, the party must either:

1. Increase its margin in large urban hubs to overcome the high winner-threshold.
2. Specifically target smaller-electorate "High Opportunity" rural seats (identified in Task 13) where a smaller absolute number of votes can secure a seat.