# Project Reflection (Group 1)

### About the Project

This project focused on analysing Earth’s surface temperature trends and extreme heat events using
climate projection data from 2006 to 2080. The primary objective was to investigate whether Earth’s
temperature is increasing over time, identify days with extreme heat (305K for moderate heat risk and
308K for extreme heat risk), and analyse key factors influencing temperature variations. This included
assessing seasonal changes and differences between extreme and non-extreme heat days. By
understanding these patterns, we aimed to provide insights into climate change impacts, which are
crucial for climate adaptation and mitigation strategies.

### Our Approach and Results

We used a data-driven approach with multiple steps. Data preprocessing involved converting time
data, correcting errors (e.g., FSNS, FLNS, and PRECT cannot be negative), creating new features
(year, month, extreme heat indicators), and rescaling values to a common 0 to 1 range with
`MinMaxScaler` for consistency.
Next, exploratory data analysis (EDA) used summary statistics, visualisations, and correlation
matrices to identify trends in monthly and yearly temperature variations. Factor analysis applied
Pearson and Spearman correlations to assess how FSNS, FLNS, QBOT, UBOT, VBOT, PRECT, and
PRSN influence TREFHT across different seasons and between extreme and non-extreme heat days.
For trend analysis, we used time series techniques and rolling averages to confirm year-over-year
warming and identify periods with frequent extreme heat days, helping forecast future occurrences.
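The preprocessing steps described above could look roughly like the sketch below. It is illustrative rather than a copy of the project notebook: the `time` column name, the choice to clip negative values to zero, the use of `>=` at the thresholds, and the three sample rows are all assumptions.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Thresholds from the project definition, in kelvin.
MODERATE_K = 305.0
EXTREME_K = 308.0
# Variables that cannot physically be negative.
NON_NEGATIVE = ["FSNS", "FLNS", "PRECT"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean a raw table and derive the features used in later steps."""
    out = df.copy()
    out["time"] = pd.to_datetime(out["time"])
    # Assumption: impossible negative values are replaced with zero.
    out[NON_NEGATIVE] = out[NON_NEGATIVE].clip(lower=0)
    out["year"] = out["time"].dt.year
    out["month"] = out["time"].dt.month
    # Heat indicators are computed on the unscaled temperature.
    out["moderate_heat"] = out["TREFHT"] >= MODERATE_K
    out["extreme_heat"] = out["TREFHT"] >= EXTREME_K
    # Rescale the cleaned driver variables to the range [0, 1].
    out[NON_NEGATIVE] = MinMaxScaler().fit_transform(out[NON_NEGATIVE])
    return out


# Tiny hypothetical sample, not project data.
raw = pd.DataFrame({
    "time": ["2006-01-15", "2006-07-15", "2080-07-15"],
    "TREFHT": [275.0, 306.0, 309.5],
    "FSNS": [40.0, 250.0, -3.0],
    "FLNS": [60.0, 80.0, 70.0],
    "PRECT": [-1e-9, 2e-8, 0.0],
})
clean = preprocess(raw)
print(clean[["year", "month", "moderate_heat", "extreme_heat"]])
```

Keeping `TREFHT` in kelvin while scaling only the driver variables means the 305K and 308K thresholds can still be applied directly after preprocessing.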

Through our analysis, we confirmed that Earth’s surface temperature is increasing over time. The data
exhibited a clear year-over-year warming trend, with a rise in extreme heat days. These findings align
with IPCC projections of global temperature increase. We identified that QBOT (humidity), FSNS
(solar radiation), and FLNS (longwave radiation) are the strongest influencers of temperature. In
summer, FSNS plays a dominant role in increasing heat levels, while in winter, FLNS and wind
components (UBOT/VBOT) have a greater impact. Higher humidity (QBOT) was found to amplify heat
stress, increasing the frequency of extreme heat days. Additionally, extreme heat days are becoming
more frequent, with more days exceeding 305K and 308K observed in the later years of the dataset.
This suggests an increasing risk of heatwaves, health impacts, and environmental changes.
Understanding seasonal differences was crucial in analysing temperature fluctuations, as summer
and winter showed different dominant factors. For example, net longwave radiation (FLNS), which is
sensitive to cloud cover, and wind patterns (UBOT, VBOT) regulate temperature differently across
seasons.
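The trend and threshold counts reported above can be computed along the following lines. This is a sketch on a synthetic daily series with a warming trend deliberately built in, so it demonstrates the method only and does not reproduce our results. The 10-year rolling window and the parameters of the synthetic series are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic daily temperatures (illustrative only): a seasonal cycle,
# an imposed trend of 0.05 K per year, and random day-to-day noise.
rng = np.random.default_rng(0)
dates = pd.date_range("2006-01-01", "2080-12-31", freq="D")
years_elapsed = (dates.year - 2006).to_numpy()
day_of_year = dates.dayofyear.to_numpy()
seasonal = 12.0 * np.sin(2 * np.pi * (day_of_year - 105) / 365.25)
noise = rng.normal(0.0, 3.0, len(dates))
df = pd.DataFrame(
    {"TREFHT": 288.0 + seasonal + 0.05 * years_elapsed + noise},
    index=dates,
)

# Yearly means, smoothed with a centred 10-year rolling average.
yearly_mean = df["TREFHT"].groupby(df.index.year).mean()
rolling_10y = yearly_mean.rolling(window=10, center=True).mean()

# Slope of a least-squares line through the yearly means (K per year).
slope = np.polyfit(yearly_mean.index.to_numpy(), yearly_mean.to_numpy(), 1)[0]

# Number of days per year exceeding each threshold.
hot_days = pd.DataFrame({
    "above_305K": (df["TREFHT"] > 305.0).groupby(df.index.year).sum(),
    "above_308K": (df["TREFHT"] > 308.0).groupby(df.index.year).sum(),
})

print(f"Estimated trend: {slope:.3f} K per year")
print(hot_days.tail())
```

Comparing `hot_days` between the first and last decades gives a simple check on whether days above the thresholds become more frequent over time.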

### Learning Outcomes

Through this project, the team deepened its understanding of climate data analysis and gained practical experience in data processing, visualisation, and time series analysis. In the initial stage, we strengthened our ability to use Pandas for data cleaning, including handling missing values, detecting and treating outliers, and identifying duplicate records. We also improved our skills with Matplotlib and Seaborn, producing time series plots, histograms, box plots, heatmaps, and density plots. We learned fundamental methods of time series analysis: plotting series to check whether variables show clear upward or downward trends, applying moving averages to smooth short-term fluctuations so that long-term trends are easier to see, and grouping data by season rather than only by year to reveal periodic patterns. When interpreting results, we learned to draw on the literature and domain knowledge to make reasoned inferences. The project and its literature review also improved our understanding of research methodology, from formulating research questions and designing the study through data processing and analysis to interpreting results and drawing conclusions.

The project also showed us the importance of documentation and team collaboration. Clear code comments made the code more readable and maintainable, and separating Markdown explanations from code cells in Jupyter Notebook made the overall logic of the project easier to follow. Managing the project through GitHub made collaboration more efficient and gave us a consistent version control workflow. When analysing results, discussing different perspectives within the team and combining them helped us refine our analysis approach.

In terms of reporting results, we learned that the presentation must align closely with the written report and clearly show findings consistent with the report's conclusions. Definitions of key concepts should also be more precise and supported by references. For example, an extreme heatwave event is generally defined as a prolonged period (typically several days or longer) of abnormally high temperatures in a specific region. Because our analysis counted individual days above a threshold, it would be more accurate to describe what we identified as extreme heat days rather than extreme heatwave events.

### Challenges Encountered

One of the main challenges we faced was interpreting meteorological interactions. Understanding how
different factors such as FSNS, FLNS, wind, and precipitation interact was complex. Some variables
exhibited non-linear relationships, making simple correlation analysis insufficient for capturing the full
dynamics of temperature variations. Another challenge was comparing seasonal influences across
years. Differentiating the role of temperature drivers between seasons required careful statistical
comparisons.
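The limitation of simple correlation for non-linear relationships, and the per-season comparison, can be illustrated with the sketch below. The data are synthetic: `driver` and `response` are hypothetical columns with a deliberately exponential relationship, not project variables, and the month-to-season mapping assumes Northern Hemisphere meteorological seasons. Spearman's coefficient is computed as Pearson's coefficient on ranks, which is its definition.

```python
import numpy as np
import pandas as pd

# Assumption: meteorological seasons for the Northern Hemisphere.
SEASON_OF_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def pearson_and_spearman(x: pd.Series, y: pd.Series) -> tuple[float, float]:
    """Return (Pearson r, Spearman rho); Spearman is Pearson on ranks."""
    return float(x.corr(y)), float(x.rank().corr(y.rank()))


def correlations_by_season(df: pd.DataFrame, driver: str, target: str) -> pd.DataFrame:
    """Compute both coefficients separately for each season."""
    seasons = df["month"].map(SEASON_OF_MONTH)
    rows = {
        season: pearson_and_spearman(part[driver], part[target])
        for season, part in df.groupby(seasons)
    }
    return pd.DataFrame.from_dict(
        rows, orient="index", columns=["pearson", "spearman"]
    )


# Synthetic example: the response grows exponentially with the driver,
# so the relationship is monotonic but far from linear.
rng = np.random.default_rng(1)
n = 2000
driver = rng.uniform(0.0, 1.0, n)
demo = pd.DataFrame({
    "month": rng.integers(1, 13, n),
    "driver": driver,
    "response": np.exp(5.0 * driver) + rng.normal(0.0, 0.5, n),
})
table = correlations_by_season(demo, "driver", "response")
print(table.round(3))
```

In this example Spearman's coefficient is higher than Pearson's in every season, because it captures the monotonic relationship while Pearson's measures only the linear part. Neither coefficient would detect a relationship that is not monotonic.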

### Future Improvements

One area for improvement is defining extreme heat thresholds more precisely. While we used 305K
and 308K as standard thresholds, the impact of extreme heat may vary based on regional differences
and human adaptation capacity. Future research could incorporate location-based heat thresholds to
enhance the accuracy of heat stress analysis. Another improvement would be the use of machine
learning models for temperature prediction. Instead of relying solely on correlation analysis, we could
apply regression models such as Random Forest or XGBoost to predict future temperatures based on
climate variables. This would provide more robust and data-driven insights into temperature
forecasting. Additionally, we could expand the analysis to multiple locations. This study focused on a
single location near Manchester, but a broader analysis covering multiple geographic areas could
reveal regional climate differences and improve predictive capabilities. Furthermore, integrating
real-world climate events, such as historical heatwaves and climate anomalies, would help validate
findings and improve model accuracy.
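As a starting point for the regression idea, a Random Forest could be fitted with scikit-learn roughly as sketched below. This is not something we implemented in the project. The synthetic data, the chronological 80/20 split, and the hyperparameters are assumptions chosen only to make the example runnable.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

FEATURES = ["FSNS", "FLNS", "QBOT", "UBOT", "VBOT", "PRECT", "PRSN"]


def fit_and_evaluate(df: pd.DataFrame, train_fraction: float = 0.8):
    """Train on the earliest rows and test on the latest (no shuffling)."""
    split = int(len(df) * train_fraction)
    train, test = df.iloc[:split], df.iloc[split:]
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(train[FEATURES], train["TREFHT"])
    predicted = model.predict(test[FEATURES])
    metrics = {
        "r2": r2_score(test["TREFHT"], predicted),
        "mae": mean_absolute_error(test["TREFHT"], predicted),
    }
    importances = pd.Series(
        model.feature_importances_, index=FEATURES
    ).sort_values(ascending=False)
    return model, metrics, importances


# Synthetic stand-in data: features already scaled to [0, 1], with the
# target driven mainly by FSNS and QBOT by construction.
rng = np.random.default_rng(42)
n = 1500
demo = pd.DataFrame(rng.uniform(0.0, 1.0, (n, len(FEATURES))), columns=FEATURES)
demo["TREFHT"] = (
    280.0 + 20.0 * demo["FSNS"] + 10.0 * demo["QBOT"] + rng.normal(0.0, 1.0, n)
)

model, metrics, importances = fit_and_evaluate(demo)
print(metrics)
print(importances)
```

One design point to keep in mind is that tree-based models cannot predict values outside the range of the training targets. For a series with a warming trend, a model trained on earlier years may underestimate later temperatures unless the trend is modelled separately.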