# Weather Notebook

This is the notebook for the assessment for Computer Infrastructure.

Author: Grace Mary Smyth

Lecturer: Ian McLoughlin

## Purpose
The purpose of the assessment is to demonstrate ability in the following:

 -  Use, configure, and script in a command line interface environment.
 -  Manipulate and move data and code using the command line.
 -  Compare commonly available software infrastructures and architectures.
 -  Select appropriate infrastructure for a given computational task.

The assessment consists of three overlapping parts: a GitHub repository containing all your work (20%), a series of tasks (40%), and a small project (40%).

https://github.com/ianmcloughlin/2425_computer_infrastructure

## Repository (20%)

A repository called computer_infrastructure_assessment was created on GitHub. This repository contains all the elements requested by the assessment outline.

I first added a .gitignore file listing the files Git should ignore. It combines the Python, macOS and Windows templates from the gitignore templates on GitHub.

A README.md was also added.

## Task 1
A directory structure was created using the command line. A folder/directory called data was created. Inside data, two subdirectories were created: timestamps and weather. This was completed using the mkdir command on the command line.
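
For comparison, the same folder layout can be created from a notebook cell. This is only a sketch and not part of the submitted work: it uses Python's `pathlib`, and a temporary directory stands in for the repository root (an assumption made so the example can be run anywhere without side effects).

```python
import tempfile
from pathlib import Path

# A temporary directory stands in for the repository root (assumption for this sketch).
root = Path(tempfile.mkdtemp())

for sub in ("timestamps", "weather"):
    # parents=True also creates the data folder; exist_ok=True makes reruns harmless.
    (root / "data" / sub).mkdir(parents=True, exist_ok=True)

# List what was created inside data.
created = sorted(p.name for p in (root / "data").iterdir())
print(created)  # ['timestamps', 'weather']
```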

## Task 2
Using the date command on the command line, I output the current date and time. Then, by typing date >> now.txt, I appended the current date and time to the file now.txt.

## Task 3
In this section I formatted timestamps. This is particularly useful when using the timestamp as a file name, because a compact, fixed-width format is easy for programs to read. When using a timestamp as a file name, don't use / or : because / is the path separator and : is not allowed in file names on some systems (for example Windows). For this section I entered date +"%Y%m%d_%H%M%S" on the command line. This output the date in the format YYYYMMDD_HHMMSS. By running date +"%Y%m%d_%H%M%S" >> formatted.txt I appended the output to a new file called formatted.txt.
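
The same format codes are understood by Python's `datetime.strftime`, so the layout can be checked in a notebook cell. A minimal sketch, using a fixed example time (chosen for illustration) rather than the current time so that the result is reproducible:

```python
from datetime import datetime

# Fixed example moment, so the result does not depend on when the cell is run.
example = datetime(2024, 11, 27, 10, 4, 3)

# Same codes as in the date command: year, month, day, underscore, hour, minute, second.
stamp = example.strftime("%Y%m%d_%H%M%S")
print(stamp)  # 20241127_100403
```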

## Task 4

For this task I no longer used the >> or > redirection operators. I created an empty file using the touch command. Because the file had to be named after the date and time at that moment, I embedded the date command in the touch command using ` (backticks).

On the command line:
 `` touch `date +"%Y%m%d_%H%M%S.txt"` ``
This gave me an empty text file called 20241127_100403.txt.

## Task 5
This task involved downloading the weather data for the Athenry weather station from Met Eireann. The URL specified by the assignment is:
https://prodapi.metweb.ie/observations/athenry/today

First I navigated to the data folder and then the weather folder using the cd command. The weather folder was empty, which I confirmed with the ls command. Using the wget command and the URL, the weather data from the Athenry weather station was saved as a JSON file in the weather folder under the name "today". (Note: when copying and pasting the URL from the assignment, the trailing full stop was included. I did not notice this initially and got a 404 Not Found error.) "today" is not a useful name for a file. It causes problems when the same information is requested regularly, which is needed later in the assignment. It makes more sense to download the data under a timestamped name, which makes the files easier to identify.

On the command line:
```
wget https://prodapi.metweb.ie/observations/athenry/today
```

## Task 6
As identified in Task 5, the file I requested is saved into the weather folder/directory as "today", which is not a useful filename if we are requesting data regularly. This task modifies Task 5 to save the file under a timestamped name, i.e. it combines Task 5 with Task 3. This can be done with the -O option (capital letter O, as per the manual page for wget) and by embedding the date command using ` (backticks).

On the command line:
```
wget -O `date +"%Y%m%d_%H%M%S.json"` https://prodapi.metweb.ie/observations/athenry/today
```
At this point I cleaned up my repository using rm to remove unwanted files. I also renamed a file; in bash this is done with mv oldfilename newfilename (ren oldfilename newfilename is the Windows Command Prompt equivalent).
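
One practical benefit of the YYYYMMDD_HHMMSS naming is that alphabetical order matches chronological order, and the name can be parsed back into a date. A small sketch of this idea: only 20241129_141156.json is a real file name from this notebook (it is the file loaded in Task 9), and the other two names are made up for illustration.

```python
from datetime import datetime
from pathlib import Path

# Example names; only 20241129_141156.json comes from this notebook, the others are hypothetical.
names = ["20241129_141156.json", "20241127_100403.json", "20241128_090000.json"]

# Plain string sorting puts the oldest file first.
ordered = sorted(names)

# Recover the download time from the newest file name (stem drops the .json extension).
newest = datetime.strptime(Path(ordered[-1]).stem, "%Y%m%d_%H%M%S")
print(ordered)
print(newest)
```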

## Task 7
Automating Task 6 

In the root of the repository I added a file called weather.sh. The sh stands for shell. On the first line I entered the shebang
#! /bin/bash
This tells the system to run the script with the bash shell. The rest of the script is:

```
date
echo "Downloading weather data..."
wget -O data/weather/$(date +"%Y%m%d_%H%M%S").json https://prodapi.metweb.ie/observations/athenry/today
echo "Weather data downloaded."
date
```
This saves the output to the correct folder and also prints progress messages, along with the start and end times, on the command line.
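
For readers more comfortable in Python, the logic of the script can be sketched as follows. This is an illustration only, not the submitted weather.sh: the function names `output_path` and `download_weather` are hypothetical, and `urllib.request.urlretrieve` from the standard library is swapped in for wget. The download function needs network access, so only the path-building step is exercised here.

```python
from datetime import datetime
from pathlib import Path
from urllib.request import urlretrieve

URL = "https://prodapi.metweb.ie/observations/athenry/today"


def output_path(now, folder="data/weather"):
    """Return folder/YYYYMMDD_HHMMSS.json for the given time (hypothetical helper)."""
    return Path(folder) / (now.strftime("%Y%m%d_%H%M%S") + ".json")


def download_weather(folder="data/weather"):
    """Download today's observations into a timestamped file (needs network access)."""
    target = output_path(datetime.now(), folder)
    target.parent.mkdir(parents=True, exist_ok=True)
    print("Downloading weather data...")
    urlretrieve(URL, str(target))
    print("Weather data downloaded.")
    return target


# Path building can be checked without touching the network.
example = output_path(datetime(2024, 11, 29, 14, 11, 56))
print(example)
```

Calling `download_weather()` from the root of the repository would then mirror what the shell script does.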

## Task 8
This Notebook

## Task 9

Task 9 is completed in this notebook: use the pandas function read_json() to load any of the weather files downloaded previously, examine and summarize the data, and use the information provided on data.gov.ie to write a short explanation of what the data set contains.

First, import the libraries.

In [2]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json


Load the weather data.

In [3]:
# Folder containing the downloaded weather files
weather_dir = '/workspaces/computer_infrastructure_assessment/data/weather'

# List all files in the directory
files = os.listdir(weather_dir)

# Keep only the JSON files
json_files = [file for file in files if file.endswith('.json')]

# Check if there are any JSON files
if json_files:
    # Load the first JSON file found
    with open(os.path.join(weather_dir, json_files[0]), "r") as f:
        data = json.load(f)
    print(f"Loaded file: {json_files[0]}")
else:
    print("No JSON files found in the directory.")

Loaded file: 20241129_141156.json
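
The task asks for the pandas `read_json()` function, while the cell above loads the file with `json.load`. As a hedged alternative, the sketch below shows `read_json()` on two records copied from this notebook's data and trimmed to a few fields. An in-memory `StringIO` buffer stands in for a downloaded file (an assumption so the example runs without the data folder); in practice the path of a file in data/weather could be passed instead. `dtype=False` and `convert_dates=False` are used so the values stay exactly as they appear in the JSON.

```python
import io
import json

import pandas as pd

# Two records copied from the printed data in this notebook, trimmed to a few fields.
sample = [
    {"name": "Athenry", "temperature": "13", "windGust": "46", "windDirection": 180, "reportTime": "00:00"},
    {"name": "Athenry", "temperature": "13", "windGust": "-", "windDirection": 180, "reportTime": "01:00"},
]

# read_json accepts a path or a file-like object; StringIO stands in for a file here.
buffer = io.StringIO(json.dumps(sample))
df_sample = pd.read_json(buffer, dtype=False, convert_dates=False)

print(df_sample.shape)
print(df_sample["reportTime"].tolist())
```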


## Examining the data

In [4]:
print(data)

[{'name': 'Athenry', 'temperature': '13', 'symbol': '02n', 'weatherDescription': 'Fair', 'text': '"Fair"', 'windSpeed': '19', 'windGust': '46', 'cardinalWindDirection': 'S', 'windDirection': 180, 'humidity': ' 77 ', 'rainfall': ' 0.0 ', 'pressure': '1014', 'dayName': 'Friday', 'date': '29-11-2024', 'reportTime': '00:00'}, {'name': 'Athenry', 'temperature': '13', 'symbol': '02n', 'weatherDescription': 'Fair', 'text': '"Fair"', 'windSpeed': '30', 'windGust': '-', 'cardinalWindDirection': 'S', 'windDirection': 180, 'humidity': ' 80 ', 'rainfall': ' 0.0 ', 'pressure': '1013', 'dayName': 'Friday', 'date': '29-11-2024', 'reportTime': '01:00'}, {'name': 'Athenry', 'temperature': '13', 'symbol': '04n', 'weatherDescription': 'Cloudy', 'text': '"Cloudy"', 'windSpeed': '24', 'windGust': '43', 'cardinalWindDirection': 'S', 'windDirection': 180, 'humidity': ' 81 ', 'rainfall': ' 0.0 ', 'pressure': '1013', 'dayName': 'Friday', 'date': '29-11-2024', 'reportTime': '02:00'}, {'name': 'Athenry', 'temper

In [6]:
print(type(data))

<class 'list'>


First, convert the list to a pandas DataFrame, then print the info of the DataFrame.

In [8]:
# Convert the list to a pandas DataFrame
df = pd.DataFrame(data)

# Print the info of the DataFrame
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14 entries, 0 to 13
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype 
---  ------                 --------------  ----- 
 0   name                   14 non-null     object
 1   temperature            14 non-null     object
 2   symbol                 14 non-null     object
 3   weatherDescription     14 non-null     object
 4   text                   14 non-null     object
 5   windSpeed              14 non-null     object
 6   windGust               14 non-null     object
 7   cardinalWindDirection  14 non-null     object
 8   windDirection          14 non-null     int64 
 9   humidity               14 non-null     object
 10  rainfall               14 non-null     object
 11  pressure               14 non-null     object
 12  dayName                14 non-null     object
 13  date                   14 non-null     object
 14  reportTime             14 non-null     object
dtypes: int64(1), object(14)
memory usage: 1.8+ KB
None

There are 14 entries in the DataFrame (rows 0 to 13) and 15 columns. The DataFrame records the weather at Athenry on a given day, with one row per hourly report, under the following parameters:

 -  name : station name (Athenry)
 -  temperature
 -  symbol : weather symbol code, e.g. 02n
 -  weatherDescription : e.g. Fair, Cloudy
 -  text : the description again, wrapped in quotation marks
 -  windSpeed
 -  windGust : appears as - in one of the records shown
 -  cardinalWindDirection : e.g. S
 -  windDirection : e.g. 180
 -  humidity
 -  rainfall
 -  pressure
 -  dayName : e.g. Friday
 -  date : e.g. 29-11-2024
 -  reportTime : e.g. 00:00

Only windDirection is stored as an integer (int64). The other 14 columns are stored as text (object), and every column has 14 non-null values.
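
Because most columns in the `df.info()` output have dtype object (text), they need converting before any numerical summary such as a mean can be calculated. A minimal sketch of one way to do this, using three values per column taken from the records printed earlier. `pd.to_numeric` with `errors="coerce"` turns anything that is not a number, such as the `-` in windGust, into NaN, and `str.strip` removes the spaces around the humidity values.

```python
import pandas as pd

# Values taken from the first three records printed in this notebook.
sample = pd.DataFrame({
    "temperature": ["13", "13", "13"],
    "windGust": ["46", "-", "43"],
    "humidity": [" 77 ", " 80 ", " 81 "],
})

# Strip surrounding spaces, then convert; non-numeric entries become NaN.
numeric = sample.apply(lambda col: pd.to_numeric(col.str.strip(), errors="coerce"))

print(numeric.dtypes)
print(numeric["windGust"].isna().sum())
```

After a conversion like this, `numeric.describe()` gives the usual summary statistics for each column.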



## Index of commands used on the command line

 -  pwd : print working directory (shows the current folder)
 -  cd  : change directory
 -  ..  : the parent folder; cd .. moves up one level
 -  ls  : list files in current folder
 -  ll  : long listing of the folder/directory (commonly an alias for a long-format ls)
 -  clear: clear the screen
 -  rm   : remove files
 -  ren or rename: renames files or folders in the Windows Command Prompt (ren oldfilename newfilename). In bash use mv oldfilename newfilename
 -  >>   : redirects the output of a command to a file. APPENDS to the file. Does not overwrite
 -  >    : redirects the output of a command to a file and **OVERWRITES FILE!** Use with caution. Cannot be undone
 -  <    : sends the contents of a file to the standard input of the programme
 -  cat  : concatenate. Reads and displays the contents of a file
 -  CTRL D : sends an end-of-file (end of input) signal
 -  CTRL C  : interrupts (kills) the running command
 -  |     : (vertical line) called pipe. Used to pipe or transfer the output from the command on its left into the standard input of the command on its right
 -  grep  : searches for lines matching a text pattern in files or piped input
 -  free -h: shows how much RAM is used and free on the system, in human-readable units
 -  `      : (backtick) used to embed the output of one command in another command
 -  touch  : creates an empty file (or updates the timestamp of an existing file)
 -  wget   : non-interactive network downloader, used to retrieve files from the internet via the HTTP, HTTPS and FTP protocols
 -  which  : shows the location of the executable file for a given command
 -  echo   : prints its arguments to the screen

## References:

https://github.com/ianmcloughlin/2425_computer_infrastructure

https://www.ibm.com/docs/en/aix/7.3?topic=directories-creating-mkdir-command

https://www.hostinger.com/tutorials/linux-commands?utm_campaign=Generic-Tutorials-DSA|NT:Se|LO:Other-EU&utm_medium=ppc&gad_source=1&gclid=Cj0KCQiAi_G5BhDXARIsAN5SX7oV8YyiZMWVi6Bm-P2IaQ6n6OCLxsFQ_EDxyvKcJ_trvSC-qPB9OHoaAqfmEALw_wcB

https://www.geeksforgeeks.org/linux-commands-cheat-sheet/

https://www.geeksforgeeks.org/wget-command-in-linux-unix/

manual page for wget

https://tecadmin.net/wget-download-files-to-specific-directory/


### End