In [5]:
### Project: Analysis of Domestic Residence and Permissions Data

# Developed by: Christiano Ferreira

# Introduction
from IPython.display import Image, display

# Display the image for context
img_url = "https://www.armaxgroup.com.ua/wp-content/uploads/2023/04/ireland-long-stay-visa.jpeg"
display(Image(url=img_url, width=600))

"""This analysis explores a dataset focused on Domestic Residence & Permissions applications and decisions in Ireland. The data spans the years 2017 to 2024, providing valuable insights into the trends and patterns of these applications across different nationalities.

The dataset records yearly application counts by type, status (for example received, granted, or refused), and nationality. It is a snapshot at a specific point in time and is subject to future revisions. Values between 1 and 3 are suppressed for statistical non-disclosure reasons and appear as '*' in the file, so totals computed here slightly understate the true counts.

This analysis aims to:

- **Investigate trends:** Analyze the evolution of application volumes and decision outcomes across the years 2017-2024.
- **Examine nationality variations:** Identify patterns and disparities in application rates and outcomes across different nationalities.
- **Gain insights:** Uncover potential factors influencing application decisions and inform a deeper understanding of the domestic residence and permissions landscape in Ireland.

By exploring this dataset, we aim to contribute to a better understanding of the dynamics surrounding domestic residence and permissions applications in Ireland."""

# Summary of Libraries
"""The following libraries are used in this project:
- **Pandas**: For efficient data manipulation and analysis.
- **Plotly**: To create rich, interactive visualizations.
- **Pycountry Convert**: For converting country names to continents.
- **NumPy**: For numerical operations and data handling.
"""

# Importing Required Libraries
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import pycountry_convert as pc
import numpy as np

# Reading the Data
# Raw string, so single backslashes are enough (doubling them would put literal '\\' in the path)
file_path = r"C:\Users\Chris\Desktop\PFDA\project\Domestic Residence and Permissions.csv"
data = pd.read_csv(file_path, encoding='latin1')

# Data Cleaning and Transformation
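# Suppressed values appear as '*'; replace them with NaN and convert the year columns to float
# (columns from index 3 onward are assumed to be the year columns)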
for col in data.columns[3:]:
    data[col] = data[col].replace('*', np.nan).astype(float)
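
"""Sketch: what suppression means for totals. Because each '*' stands for a value between 1 and 3, a sum that skips NaN understates the true total by 1 to 3 per suppressed cell. The example below uses made-up numbers; `demo_counts` and `suppressed_sum_bounds` are hypothetical names for illustration and are not part of the dataset or the analysis."""
import pandas as pd

def suppressed_sum_bounds(values, low=1, high=3):
    """Return (lower, upper) bounds on the sum of a column containing '*' cells."""
    series = pd.Series(values)
    suppressed = series.eq('*')
    # Blank out the suppressed cells, then parse the remaining entries as numbers
    known = pd.to_numeric(series.where(~suppressed), errors='coerce')
    known_total = known.sum()          # NaN entries are skipped by default
    n_suppressed = int(suppressed.sum())
    return known_total + low * n_suppressed, known_total + high * n_suppressed

# Assumption: invented counts with two suppressed cells
demo_counts = ['12', '*', '0', '*', '5']
demo_lower, demo_upper = suppressed_sum_bounds(demo_counts)
print(f"True total lies between {demo_lower} and {demo_upper}")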

status_filters = ["Received", "Refused", "Granted"]
filtered_data = data[data['Status'].isin(status_filters)]
year_columns = ['2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024']

# Transforming data for analysis
melted_data = filtered_data.melt(
    id_vars=['Type', 'Status', 'Nationality'], 
    value_vars=year_columns, 
    var_name='Year', 
    value_name='Applications'
)
melted_data['Applications'] = pd.to_numeric(melted_data['Applications'], errors='coerce')

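# Calculating approval and refusal rates
# Note: the denominator is the combined Received + Granted + Refused count (across all types)
# for each Year/Nationality pair, so each rate is a row's share of that combined total,
# not a granted-to-received or granted-to-decided ratio.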
total_applications = melted_data.groupby(['Year', 'Nationality'])['Applications'].transform('sum')
melted_data['Approval Rate'] = np.where(melted_data['Status'] == 'Granted', melted_data['Applications'] / total_applications, np.nan)
melted_data['Refusal Rate'] = np.where(melted_data['Status'] == 'Refused', melted_data['Applications'] / total_applications, np.nan)
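
"""Sketch: an alternative, decision-based rate. The rates above divide by the combined Received + Granted + Refused count. One common alternative (an assumption here, not the method used in this analysis) divides by decided cases only, i.e. Granted / (Granted + Refused). The example uses made-up numbers; `demo_long`, `demo_wide`, and `demo_decided` are hypothetical names."""
import pandas as pd

# Assumption: invented counts for a single placeholder nationality 'A'
demo_long = pd.DataFrame({
    'Year': ['2023'] * 3 + ['2024'] * 3,
    'Nationality': ['A'] * 6,
    'Status': ['Received', 'Granted', 'Refused'] * 2,
    'Applications': [100, 60, 20, 50, 0, 0],
})

# One row per Year/Nationality, one column per status
demo_wide = demo_long.pivot_table(
    index=['Year', 'Nationality'],
    columns='Status',
    values='Applications',
    aggfunc='sum',
)

demo_decided = demo_wide['Granted'] + demo_wide['Refused']
# Leave the rate as NaN where no decisions were made, instead of dividing by zero
demo_wide['Approval Rate'] = (demo_wide['Granted'] / demo_decided).where(demo_decided > 0)
demo_wide['Refusal Rate'] = (demo_wide['Refused'] / demo_decided).where(demo_decided > 0)
print(demo_wide)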

# Descriptive Statistics
print("\nDescriptive Statistics:")
print(melted_data.describe())

# Data Visualization with Explanations

# 1. Line Chart for Trends
"""This line chart shows the yearly totals of applications received, granted, and refused, making it easier to spot shifts that may stem from policy changes or external factors."""
summary = melted_data.groupby(['Year', 'Status']).agg({'Applications': 'sum'}).reset_index()
fig = px.line(summary, x='Year', y='Applications', color='Status', title='Trend of Applications by Year and Status')
fig.show()

# 2. Stacked Bar Chart for Granted, Refused, and Received Applications
"""The stacked bar chart provides a clear comparison of the three key statuses across all years, allowing for easy observation of proportional changes over time."""
granted_received_refused_summary = melted_data.groupby(['Year', 'Status']).agg({'Applications': 'sum'}).reset_index()
fig = px.bar(granted_received_refused_summary, x='Year', y='Applications', color='Status', title='Stacked Bar Chart for Applications')
fig.show()

# 3. Treemap of Applications by Nationality
"""This treemap illustrates the contribution of each nationality to the total applications, making it easier to identify the most represented groups in the dataset."""
treemap_data = melted_data.groupby('Nationality').agg({'Applications': 'sum'}).reset_index()
fig = px.treemap(treemap_data, path=['Nationality'], values='Applications', title='Treemap of Applications by Nationality')
fig.show()

# 4. Line Charts for Top 10 Nationalities by Status Over Time
"""These line charts track the top 10 nationalities for received, granted, and refused applications over time, helping identify patterns specific to different national groups."""
for status in status_filters:
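    status_data = melted_data[melted_data['Status'] == status]
    top_10_status = status_data.groupby('Nationality')['Applications'].sum().sort_values(ascending=False).head(10).index
    # Sum across application types so each nationality has exactly one point per year
    top_10_data = status_data[status_data['Nationality'].isin(top_10_status)].groupby(['Year', 'Nationality'], as_index=False)['Applications'].sum()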
    fig = px.line(top_10_data, x='Year', y='Applications', color='Nationality', title=f'Top 10 Nationalities for {status} Applications Over Time')
    fig.show()

# 5. Interactive Geographic Map
"""The geographic map highlights where the majority of applications originate, providing a global perspective on the data distribution."""
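# Aggregate to one value per country and year (received applications only); the raw long
# table holds several rows per country in each frame (one per status and type).
map_data = melted_data[melted_data['Status'] == 'Received'].groupby(['Year', 'Nationality'], as_index=False)['Applications'].sum()
fig = px.choropleth(map_data, locations='Nationality', locationmode='country names', color='Applications', hover_name='Nationality', animation_frame='Year', title='Geographic Distribution of Applications Received Over Time')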
fig.show()

# Conclusion
"""Key Insights:
- Application volumes fluctuate across 2017-2024, which may reflect both global migration trends and Ireland's evolving immigration policies.
- Certain nationalities consistently account for a large share of both granted and refused applications, a pattern that could be explored further for policy evaluation.
- The relationship between received, granted, and refused counts was not modeled here; correlation or regression analysis could clarify how outcomes relate to application volume.
- The geographic map and line charts show how application patterns shift over time and by region.

### Future Considerations:
- Incorporating additional variables, such as visa types and socioeconomic indicators, could offer deeper insights into the factors influencing application outcomes.
- Further analysis could focus on year-over-year changes for specific regions to identify migration patterns more effectively.

# References
- Data.gov.ie. (2024). Domestic Residence & Permissions Applications and Decisions by Year and Nationality. Retrieved from: https://data.gov.ie/dataset/domestic-residence-permissions-applications-and-decisions-year-and-nationality
- McKinney, W. (2010). Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference, 51–56. Retrieved from: https://pandas.pydata.org/
- Plotly Technologies Inc. (n.d.). Interactive Graphing Library for Python. Retrieved from: https://plotly.com/
- NumPy Developers. (n.d.). NumPy: Fundamental package for scientific computing with Python. Retrieved from: https://numpy.org/
- Pycountry-Convert. (n.d.). Python Library for Country and Continent Conversion. Retrieved from: https://pypi.org/project/pycountry-convert/
- Image Source: Armax Group. (n.d.). Ireland Long Stay Visa Image. Retrieved from: https://www.armaxgroup.com.ua/
"""
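
"""Sketch: year-over-year change, as suggested under Future Considerations. This is a minimal illustration on made-up numbers, not a result from the dataset; `demo_yearly` and the placeholder nationalities 'A' and 'B' are hypothetical."""
import pandas as pd

# Assumption: invented yearly totals for two placeholder nationalities
demo_yearly = pd.DataFrame({
    'Nationality': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Year': ['2022', '2023', '2024'] * 2,
    'Applications': [100, 150, 120, 40, 40, 50],
})

# Sort so that each nationality's years are in order before differencing
demo_yearly = demo_yearly.sort_values(['Nationality', 'Year'])
# Fractional change relative to the previous year, computed within each nationality
demo_yearly['YoY Change'] = demo_yearly.groupby('Nationality')['Applications'].pct_change()
print(demo_yearly)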



Descriptive Statistics:
       Applications  Approval Rate  Refusal Rate
count   4410.000000     723.000000    544.000000
mean      54.104308       0.407647      0.102784
std      207.201219       0.102572      0.086688
min        0.000000       0.000000      0.000000
25%        0.000000       0.352250      0.048306
50%        0.000000       0.400000      0.091997
75%       18.000000       0.453621      0.138151
max     3149.000000       1.000000      1.000000

