# Task 8: Weather Analysis
### Author: Aoife Flavin

The purpose of this notebook is to explain how I completed Tasks 1-7 of the weekly tasks assigned to me in the Computer Infrastructure module, in Semester 2 of the Higher Diploma in Data Analytics at ATU.

In the second half of this notebook I will use Pandas to read and analyse a file that I previously created during the module, which contains weather data for Athenry.

## Part 1: Task Description

#### Task 1: Create Directory Structure
*Using the command line, create a directory (that is, a folder) named data at the root of your repository. Inside data, create two subdirectories: timestamps and weather.*

To complete the first task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
mkdir data
```
This creates a data directory


```
cd data
```
Navigate to data directory

```
mkdir timestamps
```
Creates the timestamps directory

```
mkdir weather
```
Creates the weather directory

```
cd timestamps
```
Navigates to the timestamps directory

```
touch testing.txt
```
Creates an empty text file in the timestamps directory. This is needed because Git does not track empty directories, so an empty directory cannot be pushed to GitHub.

```
cd ..

cd weather
```
Navigate to weather directory

```
touch testing.txt
```
Creates an empty text file in the weather directory
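
The steps above can also be condensed into a few commands. This is only a sketch of an equivalent approach, not the sequence I actually ran; it assumes it is run from the root of the repository and uses the same directory and placeholder file names as above.

```shell
# -p creates any missing parent directories and does not fail
# if a directory already exists.
mkdir -p data/timestamps data/weather

# Git does not track empty directories, so add a placeholder file to each.
touch data/timestamps/testing.txt data/weather/testing.txt

# List the resulting directory tree as a quick check.
ls -R data
```

Because `mkdir -p` creates `data` along with its subdirectories, there is no need to `cd` between steps.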

#### Task 2: Timestamps
*Navigate to the data/timestamps directory. Use the date command to output the current date and time, appending the output to a file named now.txt. Make sure to use the >> operator to append (not overwrite) the file. Repeat this step ten times, then use the more command to verify that now.txt has the expected content.*

To complete the second task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
cd data/timestamps
```
Navigate to the timestamps directory

```
date >> now.txt
```
This creates the file now.txt if it does not already exist and appends the current date and time to it. I ran this 10 times.

```
more now.txt
```
I then used the 'more' command to view the contents of now.txt
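
Rather than typing the command ten times, the repetition could be expressed as a loop. This is a sketch of an alternative, not what I did for the task; the loop variable `i` and the line count check are my own additions.

```shell
# Append the current date and time ten times.
# >> appends to now.txt rather than overwriting it.
for i in $(seq 1 10); do
    date >> now.txt
done

# Count the lines in now.txt as a quick check alongside `more now.txt`.
wc -l < now.txt
```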


#### Task 3: Formatting Timestamps
*Run the date command again, but this time format the output using YYYYmmdd_HHMMSS (e.g., 20261114_130003 for 1:00:03 PM on November 14, 2026). Refer to the date man page (using man date) for more formatting options. (Press q to exit the man page). Append the formatted output to a file named formatted.txt.*

To complete the third task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
cd data/timestamps
```
Navigate to the timestamps directory

```
man date
```
This allowed me to read the manual page for the date command and decide how to format the date

```
q
```
Exit manual page

```
date +"%Y%m%d_%H%M%S" >> formatted.txt
```
This generates the current date and time in the specified format (YYYYmmdd_HHMMSS) and appends it to a file named formatted.txt. 
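
To make the format string easier to read, here is a small sketch that spells out each specifier. The variable name `stamp` is just an illustrative choice.

```shell
# %Y = four-digit year, %m = month (01-12), %d = day of month (01-31)
# %H = hour (00-23),    %M = minute (00-59), %S = second
stamp=$(date +"%Y%m%d_%H%M%S")

# Append the formatted timestamp to formatted.txt.
echo "$stamp" >> formatted.txt

# The timestamp is 15 characters long: 8 digits, an underscore, 6 digits.
echo "${#stamp}"
```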

#### Task 4: Create Timestamped Files
*Use the touch command to create an empty file with a name in the YYYYmmdd_HHMMSS.txt format. You can achieve this by embedding your date command in backticks ` into the touch command. You should no longer use redirection (>>) in this step.*

To complete the fourth task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
cd data/timestamps
```
Navigate to the timestamps directory

```
touch $(date +"%Y%m%d_%H%M%S").txt
```
This command creates an empty file with a name that includes the current date and time in the format YYYYmmdd_HHMMSS.txt.
- The $( ) syntax is used for command substitution, which executes the date command and captures its output.
- The date +"%Y%m%d_%H%M%S" command generates a timestamp formatted as YYYYmmdd_HHMMSS.
- The touch command then uses this formatted timestamp as the filename, creating an empty file named something like 20241215_160000.txt.
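
The task description mentions backticks, while I used `$( )`. Both perform command substitution, as the sketch below shows. The `_backticks` suffix in the second filename is only there so that the two files do not collide if both commands run within the same second.

```shell
# $(...) runs the command inside it and substitutes its output into the line.
touch "$(date +"%Y%m%d_%H%M%S").txt"

# Backticks are the older syntax for the same substitution.
touch `date +"%Y%m%d_%H%M%S"`_backticks.txt

# Show the files that were created.
ls -1 *.txt
```

The `$( )` form is generally easier to read and to nest, which is why I chose it.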

#### Task 5: Download Today's Weather Data
*Change to the data/weather directory. Download the latest weather data for the Athenry weather station from Met Eireann using wget. Use the -O < filename > option to save the file as weather.json. The data can be found at this URL:
https://prodapi.metweb.ie/observations/athenry/today.*

To complete the fifth task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
cd data/weather
```
Navigate to the weather directory

```
wget -O weather.json https://prodapi.metweb.ie/observations/athenry/today
```
The 'wget' command downloads the weather data from the URL, and the '-O' option specifies the name of the file in which the data is saved, in this case 'weather.json'.

#### Task 6: Timestamp the Data
*Modify the command from Task 5 to save the downloaded file with a timestamped name in the format YYYYmmdd_HHMMSS.json.*

To complete the sixth task, I opened the root of my repository in GitHub Codespaces and ran the following commands in the command line:

```
cd data/weather
```
Navigate to the weather directory

```
wget -O $(date +"%Y%m%d_%H%M%S").json https://prodapi.metweb.ie/observations/athenry/today
```
This command downloads the latest weather data for the Athenry weather station from Met Éireann using wget. 
The -O option specifies the output filename. The filename is generated using the current date and time formatted as YYYYmmdd_HHMMSS.json, making sure each download has a unique name based on when it was executed.

#### Task 7: Write the Script
*Write a bash script called weather.sh in the root of your repository. This script should automate the process from Task 6, saving the weather data to the data/weather directory. Make the script executable and test it by running it.*

To complete the seventh task, I navigated to the root of my repository and created a file called weather.sh.
In this file I entered the following script:
```
#!/bin/bash

wget -O data/weather/$(date +"%Y%m%d_%H%M%S").json https://prodapi.metweb.ie/observations/athenry/today
```
This script takes the command from the previous task and prefixes the output filename with data/weather/, so the file is saved in the weather directory when the script is run from the root of the repository.

```
ls -al
```
In the command line, I used this command to inspect the permissions of my files and found that the file weather.sh was not executable.

```
chmod u+x ./weather.sh
```
This command makes the file weather.sh executable for its owner (u+x adds execute permission for the user who owns the file).

```
./weather.sh
```
This runs the script in weather.sh. When run, the script saves the weather data to a JSON file whose name is the exact date and time at which the script was run.
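
As a possible refinement, the script could stop on errors and create the target directory if it is missing. The sketch below writes such a variant to a hypothetical file named weather_sketch.sh (not part of my repository), makes it executable, and checks its syntax without running it, so no download takes place.

```shell
# Write a hypothetical, slightly more defensive variant to weather_sketch.sh.
cat > weather_sketch.sh <<'EOF'
#!/bin/bash
# Stop on the first error or on use of an unset variable.
set -eu

# Make sure the target directory exists before downloading.
mkdir -p data/weather

# Save today's observations under a timestamped name.
wget -O "data/weather/$(date +"%Y%m%d_%H%M%S").json" \
    https://prodapi.metweb.ie/observations/athenry/today
EOF

# Give the owner execute permission.
chmod u+x weather_sketch.sh

# bash -n parses the script and reports syntax errors without executing it.
bash -n weather_sketch.sh

# Confirm the x permission bit is now set.
ls -l weather_sketch.sh
```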

## Part 2: Weather Analysis

In [3]:
#Data Frames
import pandas as pd

In [5]:
df = pd.read_json('data/weather/20241210_142212.json')

In [None]:
#Look at the first few rows of data
df.head()

| | name | temperature | symbol | weatherDescription | text | windSpeed | windGust | cardinalWindDirection | windDirection | humidity | rainfall | pressure | dayName | date | reportTime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Athenry | 1 | 02n | Fair | "Fair" | 9 | - | NE | 45 | 89 | 0 | 1042 | Tuesday | 2024-10-12 | 00:00 |
| 1 | Athenry | 1 | 04n | Cloudy | "Cloudy" | 7 | - | NE | 45 | 87 | 0 | 1042 | Tuesday | 2024-10-12 | 01:00 |
| 2 | Athenry | 2 | 02n | Fair | "Fair" | 15 | - | E | 90 | 87 | 0 | 1041 | Tuesday | 2024-10-12 | 02:00 |
| 3 | Athenry | -1 | 02n | Fair | "Fair" | 2 | - | NW | 315 | 92 | 0 | 1041 | Tuesday | 2024-10-12 | 03:00 |
| 4 | Athenry | -3 | 02n | Fair | "Fair" | 2 | - | S | 180 | 93 | 0 | 1041 | Tuesday | 2024-10-12 | 04:00 |


In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15 entries, 0 to 14
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   name                   15 non-null     object        
 1   temperature            15 non-null     int64         
 2   symbol                 15 non-null     object        
 3   weatherDescription     15 non-null     object        
 4   text                   15 non-null     object        
 5   windSpeed              15 non-null     int64         
 6   windGust               15 non-null     object        
 7   cardinalWindDirection  15 non-null     object        
 8   windDirection          15 non-null     int64         
 9   humidity               15 non-null     int64         
 10  rainfall               15 non-null     int64         
 11  pressure               15 non-null     int64         
 12  dayName                15 non-null     object        
 13  date   

In [8]:
df.describe()

| | temperature | windSpeed | windDirection | humidity | rainfall | pressure | date |
|---|---|---|---|---|---|---|---|
| count | 15.0 | 15.0 | 15.0 | 15.0 | 15.0 | 15.0 | 15 |
| mean | -0.533333 | 5.666667 | 66.0 | 91.066667 | 0.0 | 1040.733333 | 2024-10-12 00:00:00 |
| min | -5.0 | 2.0 | 0.0 | 82.0 | 0.0 | 1039.0 | 2024-10-12 00:00:00 |
| 25% | -3.0 | 3.0 | 45.0 | 87.0 | 0.0 | 1040.0 | 2024-10-12 00:00:00 |
| 50% | -1.0 | 6.0 | 45.0 | 92.0 | 0.0 | 1041.0 | 2024-10-12 00:00:00 |
| 75% | 2.0 | 7.0 | 45.0 | 94.5 | 0.0 | 1041.0 | 2024-10-12 00:00:00 |
| max | 4.0 | 15.0 | 315.0 | 97.0 | 0.0 | 1042.0 | 2024-10-12 00:00:00 |
| std | 3.181793 | 3.41565 | 81.33265 | 4.589844 | 0.0 | 0.798809 | |


PLAN:

- df.head: see the first few rows
- df.info: summary
- df.describe
- Convert numeric columns from strings to numbers
- Unique weather descriptions / unusual data
- Average temperature, humidity, pressure




