# <center>***Exploratory Analysis of Today's Weather Athenry***</center>


*********************************


![WEATHER](https://news.lk/media/k2/items/cache/1ae234329621819d5d02e7b6f1f5eb14_XL.jpg)

##### Author: Gabriela Domiciano Avellar
##### Project: Computer Infrastructure 




This notebook demonstrates the use of command-line tools, scripting, and automation to manage and process data. The tasks involved creating directory structures, collecting and timestamping weather data from [Today's Weather Athenry](https://data.gov.ie/dataset/todays-weather-athenry), and automating the process with Bash scripts.

The final step integrated GitHub Actions to schedule daily weather data collection and push updates to a GitHub repository.

******************************


## Weather Data Analysis Notebook

This notebook documents the steps completed over the weeks to accomplish Tasks 1 through 7. The goal is to provide a clear summary of the progress and tools used in each task, serving as a reference for the weather data analysis project.

Below, each task is briefly explained, with a description of the commands and methods applied to achieve its objectives.

#### Task 1: Creating Directories
- Using the command line, I created a `data` directory at the root of the repository. Inside `data`, I added two subdirectories: `timestamps` and `weather`.
- I listed the current directory with `ls` to check where I was, navigated to the parent directory using `cd ..`, and created the folders using `mkdir`.

#### Task 2: Recording Timestamps
- Navigated to the `data/timestamps` directory and created a file named `now.txt`.
- Used the `date` command to append the current date and time to the file without overwriting it. The command used:  
  `date +"%Y%m%d_%H%M%S" >> now.txt`  
- Repeated this step several more times and verified the file's content using the `more` command.

#### Task 3: Formatting Timestamps
- Used the `date` command to format the current date and time in different styles and appended the results to `formatted.txt`. The formats used include:
  1. `20241119_211053`: `YYYYmmdd_HHMMSS`
  2. `19-Nov-2024 21:21:47`: `dd-MMM-yyyy HH:mm:ss`
  3. `11/19/2024 09:22:08 PM`: `MM/dd/yyyy hh:mm:ss AM/PM`
  4. `46 2024`: `WW yyyy` (Week number and year)
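
The four styles above can also be reproduced in Python, whose `strftime` uses the same `%` directives as the `date` command. This is a minimal sketch and not the commands I ran in the terminal: the fixed `moment` is only there to make the output reproducible, and `%U` (week of the year, counting from the first Sunday) is an assumption chosen because it yields `46` for 19 November 2024.

```python
from datetime import datetime

# Fixed moment so the output is reproducible (taken from the first example above).
moment = datetime(2024, 11, 19, 21, 10, 53)

# Directive strings for the four styles listed above.
# "%U %Y" is an assumption: it is one week-number directive that gives 46 here.
styles = {
    "YYYYmmdd_HHMMSS": "%Y%m%d_%H%M%S",
    "dd-MMM-yyyy HH:mm:ss": "%d-%b-%Y %H:%M:%S",
    "MM/dd/yyyy hh:mm:ss AM/PM": "%m/%d/%Y %I:%M:%S %p",
    "WW yyyy": "%U %Y",
}

formatted = {label: moment.strftime(directive) for label, directive in styles.items()}

for label, text in formatted.items():
    print(f"{label:<28}{text}")
```

Note that the ISO week directive `%V` would give `47` for the same date, so the week number depends on which convention is chosen.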

#### Task 4: Creating a File with a Timestamped Name
- Created an empty file using the `touch` command with the `date` command embedded to generate a timestamped filename.
- Example command:  
  ```bash
  touch "$(date +"%Y%m%d_%H%M%S").txt"
  ```

#### Task 5: Download Today's Weather Data
- I navigated to the `data/weather` directory and used the `wget` command to download the latest weather data for the Athenry weather station from Met Éireann. I used the `-O` option to save the data as `weather.json`. The data was retrieved from the following URL: https://prodapi.metweb.ie/observations/athenry/today


#### Task 6: Timestamp the Data
- I modified the previous command to save the downloaded weather data with a timestamped filename in the format `YYYYmmdd_HHMMSS_athenry.json`. To do this, I embedded the `date` command within the `wget` command to dynamically generate the filename based on the current date and time.


#### Task 7: Write the Script
- I created a bash script called `weather.sh` that automates the process of downloading weather data for Athenry. Here's what the script does:

1. **Prints the current date and time** using `date`.
2. **Displays a message** saying "downloading weather data".
3. **Downloads the weather data** using `wget`, saving the file in the `data/weather` directory with a timestamped filename in the format `YYYYmmdd_HHMMSS_athenry.json`.
4. **Prints a message** confirming that the weather data was downloaded.
5. **Prints the current date and time again** after the download is complete.
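
For readers who prefer Python, the five steps above can be mirrored as follows. This is a rough Python analogue under stated assumptions, not the contents of `weather.sh`: the function names `timestamped_path` and `collect` are hypothetical, and `urlretrieve` stands in for `wget`.

```python
from datetime import datetime
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from Task 5.
URL = "https://prodapi.metweb.ie/observations/athenry/today"


def timestamped_path(now: datetime, directory: str = "data/weather") -> Path:
    """Build a YYYYmmdd_HHMMSS_athenry.json path inside `directory`."""
    return Path(directory) / f"{now:%Y%m%d_%H%M%S}_athenry.json"


def collect(url: str = URL, directory: str = "data/weather") -> Path:
    """Hypothetical stand-in for weather.sh: log, download, log again."""
    print(datetime.now())                      # step 1: current date and time
    print("Downloading weather data")          # step 2: status message
    target = timestamped_path(datetime.now(), directory)
    target.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, target)                   # step 3: download (wget equivalent)
    print("Weather data downloaded")           # step 4: confirmation
    print(datetime.now())                      # step 5: date and time again
    return target
```

Calling `collect()` performs the download, while `timestamped_path(datetime(2024, 12, 4, 10, 17, 3))` only builds the filename and needs no network access.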


#### Task 8: Notebook Called `weather.ipynb`
- Tasks 1 to 7 were completed using a combination of Bash commands such as `cd`, `wget`, and `date`, together with Bash scripting. The file `weather.ipynb` contains a brief report explaining how each task was completed.



## Let's have a look at the dataset - '20241204_101703.json'

In [87]:
# Import required packages:
import pandas as pd
import datetime as dt 

In [88]:
# Read the data collected on 4 December 2024
weather_df = pd.read_json('data/weather/20241204_101703.json')
weather_df.head()

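|   | name | temperature | symbol | weatherDescription | text | windSpeed | windGust | cardinalWindDirection | windDirection | humidity | rainfall | pressure | dayName | date | reportTime |
|---|------|-------------|--------|--------------------|------|-----------|----------|-----------------------|---------------|----------|----------|----------|---------|------|------------|
| 0 | Athenry | 2 | 15n | Fog / Mist | "Fog thinning" | 2 | - | S | 180 | 98 | 0 | 1024 | Wednesday | 2024-04-12 | 00:00 |
| 1 | Athenry | 2 | 15n | Fog / Mist | "Fog thickening" | - | - |  | 0 | 99 | 0 | 1024 | Wednesday | 2024-04-12 | 01:00 |
| 2 | Athenry | 2 | 15n | Fog / Mist | "Fog" | 4 | - | E | 90 | 99 | 0 | 1023 | Wednesday | 2024-04-12 | 02:00 |
| 3 | Athenry | 1 | 15n | Fog / Mist | "Fog thinning" | 6 | - | E | 90 | 99 | 0 | 1023 | Wednesday | 2024-04-12 | 03:00 |
| 4 | Athenry | 1 | 15n | Fog / Mist | "Recent Fog" | 4 | - | E | 90 | 99 | 0 | 1022 | Wednesday | 2024-04-12 | 04:00 |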


In [89]:
# Print the shape, number of rows and columns in the DataFrame.
print(weather_df.shape)

(11, 15)


In [90]:
# Check for missing values in any column of the DataFrame 'weather_df'
missing_values = weather_df.isna().any()
print(missing_values)

name                     False
temperature              False
symbol                   False
weatherDescription       False
text                     False
windSpeed                False
windGust                 False
cardinalWindDirection    False
windDirection            False
humidity                 False
rainfall                 False
pressure                 False
dayName                  False
date                     False
reportTime               False
dtype: bool
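
One caveat about this check: `isna()` only detects real missing values (`NaN`/`None`), so the `-` placeholders visible in `windSpeed` and `windGust` are reported as `False`. A minimal sketch of counting them separately, using a hypothetical `head` frame rebuilt from three columns of the preview above, and assuming `-` means "no reading":

```python
import pandas as pd

# Three columns copied from the five-row preview above (hypothetical `head` frame).
head = pd.DataFrame({
    "windSpeed": [2, "-", 4, 6, 4],
    "windGust": ["-", "-", "-", "-", "-"],
    "humidity": [98, 99, 99, 99, 99],
})

# isna() sees no missing values because "-" is an ordinary string.
true_missing = head.isna().sum()

# Count the "-" placeholders per column instead.
placeholders = head.eq("-").sum()

print(true_missing)
print(placeholders)
```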


In [91]:
weather_temperature = weather_df[['temperature', 'windSpeed', 'humidity', 'rainfall']]
print(weather_temperature)

    temperature windSpeed  humidity  rainfall
0             2         2        98         0
1             2         -        99         0
2             2         4        99         0
3             1         6        99         0
4             1         4        99         0
5             1         -        99         0
6             2         4        99         0
7             3         7        99         0
8             5         7        96         0
9             7        17        91         0
10            9        15        93         0


In [92]:
weatherdesc_temperature = weather_df['weatherDescription']
print(weatherdesc_temperature)

0     Fog / Mist
1     Fog / Mist
2     Fog / Mist
3     Fog / Mist
4     Fog / Mist
5     Fog / Mist
6     Fog / Mist
7     Fog / Mist
8     Fog / Mist
9         Cloudy
10        Cloudy
Name: weatherDescription, dtype: object


This weather dataset for Athenry, dated 04-12-2024, contains information such as the temperature, which ranges from 1°C to 9°C, and the weather description, which is Fog / Mist for most of the day and Cloudy in the last two hourly reports. Recorded wind speeds range from 2 to 17 km/h, with two hours showing `-` instead of a value. Humidity is high, ranging from 91% to 99%. No rainfall was recorded.

In [93]:
# Inspect types (named 'dtypes' to avoid shadowing the built-in 'type').
dtypes = weather_df.dtypes
print(dtypes)

name                             object
temperature                       int64
symbol                           object
weatherDescription               object
text                             object
windSpeed                        object
windGust                         object
cardinalWindDirection            object
windDirection                     int64
humidity                          int64
rainfall                          int64
pressure                          int64
dayName                          object
date                     datetime64[ns]
reportTime                       object
dtype: object
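
The `date` column was converted to `datetime64[ns]` automatically, but the preview shows `2024-04-12` (12 April) although `dayName` is Wednesday and the file was collected on 4 December 2024. This suggests the day and month were swapped during parsing. A minimal sketch of parsing the date explicitly, assuming the raw JSON stores it day-first as `04-12-2024` (the one-record `raw` payload below is hypothetical):

```python
from io import StringIO

import pandas as pd

# Hypothetical one-record payload; the day-first date string is an assumption.
raw = '[{"name": "Athenry", "date": "04-12-2024", "reportTime": "00:00"}]'

# Keep the date as text on load, then parse it with an explicit day-first format.
df = pd.read_json(StringIO(raw), convert_dates=False)
df["date"] = pd.to_datetime(df["date"], format="%d-%m-%Y")

parsed = df["date"].iloc[0]
print(parsed.date(), parsed.day_name())
```

With the explicit format, the parsed date falls on the same weekday that the `dayName` column reports.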


In [94]:
# Calculate descriptive statistics and save them to a CSV file called 'summary.csv'
file_path = 'data/summary.csv'
weather_df.describe().T.to_csv(file_path)
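
By default `describe()` summarises only numeric columns, so `windSpeed`, which is stored as `object` because of the `-` placeholders, is left out of `summary.csv`. A minimal sketch of including it, using a hypothetical `sample` frame rebuilt from the eleven rows printed earlier and assuming `-` marks a missing reading:

```python
import pandas as pd

# Values copied from the four-column printout earlier in this notebook.
sample = pd.DataFrame({
    "temperature": [2, 2, 2, 1, 1, 1, 2, 3, 5, 7, 9],
    "windSpeed": [2, "-", 4, 6, 4, "-", 4, 7, 7, 17, 15],
    "humidity": [98, 99, 99, 99, 99, 99, 99, 99, 96, 91, 93],
})

# Assumption: "-" means no reading, so it is coerced to NaN rather than 0.
sample["windSpeed"] = pd.to_numeric(sample["windSpeed"], errors="coerce")

# describe() now covers windSpeed too; NaN rows are excluded from its statistics.
summary = sample.describe().T[["count", "mean", "min", "max"]].round(2)
print(summary)
```

Treating `-` as `0` instead would lower the wind speed average, so the choice should be stated alongside any summary figures.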


#### Weather Summary

- **Temperature**: Average of **3.18°C** (min: 1.0, max: 9.0).  
- **Wind Speed**: Average of **7.33 km/h** over the nine hourly reports with a reading (min: 2.0, max: 17.0); the two hours shown as `-` are excluded.  
- **Wind Direction**: Predominantly **E/SE**, with some variation.  
- **Humidity**: Very high, average of **97.36%** (min: 91%, max: 99%).  
- **Rainfall**: None recorded (0.0 mm throughout the dataset).  
- **Pressure**: Average of **1021.73 hPa** (min: 1016, max: 1024).  

The summary shows a calm, foggy morning and a dry and cloudy day.


## Automating Weather Data Collection with GitHub Actions

I automated the `weather.sh` script to run daily and push new weather data to my GitHub repository using GitHub Actions. Below is a summary of the steps I completed:

### Setting Up the GitHub Actions Workflow
- Using the `mkdir` command, I created a `.github/workflows/` directory in my repository, then added a workflow file named `weather-data.yml` inside it.

### Scheduling the Workflow
- Using the `schedule` event with a `cron` expression, I configured the workflow to run daily at 10 AM. GitHub Actions evaluates `cron` schedules in UTC.

### Specifying the Environment
- I used a Linux virtual machine as the runtime environment for the workflow.

### Cloning the Repository
- The workflow includes a step to clone the repository so that it has access to the `weather.sh` script and necessary files.

### Executing the Script
- I added a step to execute the `weather.sh` script, which downloads and saves the latest weather data for Athenry.

### Committing and Pushing Changes
- The workflow is configured to commit the new weather data with a timestamped filename and push the changes back to the repository automatically.

### Testing the Workflow
- After committing and pushing the workflow file to the repository, I tested it using the `workflow_dispatch` event.
- I verified the logs in the Actions tab to ensure the script executed successfully, and the new weather data was correctly committed to the repository.

This automation ensures the weather data is collected daily without manual intervention.


************************************

# End