# Computer infrastructure - Tasks

**by Grainne Boyle**


This notebook describes the weekly tasks I have completed for the Computer infrastructure module.


**Contents:** 

1. [Task 1](#task-1-create-directory-structure)
2. [Task 2](#task-2-timestamps)
3. [Task 3](#task-3-formatting-timestamps)
4. [Task 4](#task-4-create-timestamped-file)
5. [Task 5](#task-5-download-todays-weather-data)
6. [Task 6](#task-6-timestamp-the-data)
7. [Task 7](#task-7-write-the-script)
8. [Task 8](#task-8-notebook)
9. [Task 9](#task-9-pandas)


## Task 1 Create Directory Structure

In the codespace, I created a new directory called `data` using the `mkdir` command. The name stands for "make directory", and the command creates a folder within the current directory. I created two subdirectories within `data`, `timestamps` and `weather`, again using `mkdir`. To create the parent directory `data` and the subdirectory `timestamps` in one step, I used the following command: ```mkdir -p data/timestamps```.  
The `-p` flag ensures that if the parent directory `data` does not exist, it is created along with the `timestamps` subdirectory.
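
As a minimal sketch of the step above, the following creates both subdirectories in one call. The scratch directory from `mktemp -d` is an assumption added only so the example can be run safely anywhere; in the codespace the command was run from the repository root.

```bash
# Work in a throwaway directory (assumption: added only for this illustration).
workdir=$(mktemp -d)
cd "$workdir"

# -p creates any missing parent directories (here: data).
mkdir -p data/timestamps data/weather

# Running the same command again does not fail, because -p ignores existing directories.
mkdir -p data/timestamps data/weather

# List the resulting structure.
ls data
```

Without `-p`, `mkdir data/timestamps` fails if `data` does not exist yet.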

## Task 2 Timestamps

In the codespace, I ran the `date` command, which printed the current date and time. To save this output to a file, I used double angle brackets (`>>`) to redirect the output and append it to a new file called now.txt. The command was:  
```date >> now.txt```  
I ran this command ten times, and each run appended a new line to now.txt. The double angle brackets ensure that the command output is added to the end of the file without overwriting it. If I had used a single angle bracket (`>`), the file would have been overwritten each time the command was executed. The result of running the command ten times can be seen in the now.txt file, which shows a list of dates and times. In this case, the dates and hours are the same, but the minutes and seconds differ with each execution.
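
The difference between `>>` and `>` can be sketched as follows. The scratch directory and the variable names `appended` and `overwritten` are assumptions introduced for this illustration, and three runs are used instead of ten to keep it short.

```bash
# Work in a throwaway directory (assumption: added only for this illustration).
workdir=$(mktemp -d)
cd "$workdir"

# >> appends: each run adds one more line to now.txt.
for i in 1 2 3; do
    date >> now.txt
done
appended=$(wc -l < now.txt)
echo "lines after three appends:" $appended

# > truncates the file before writing, so only the latest line survives.
date > now.txt
overwritten=$(wc -l < now.txt)
echo "lines after one overwrite:" $overwritten
```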
 

## Task 3 Formatting Timestamps
In the codespace, I formatted the output of the `date` command as `YYYYmmdd_HHMMSS` using the format string `%Y%m%d_%H%M%S`. In this format string, the specifiers for the year, hour, minute and second are upper case, and those for the month and day are lower case. You can view the manual for the `date` command by running `man date`, which explains the different format specifiers you can use to customize how the date and time appear. When I entered the following command:  
```date +"%Y%m%d_%H%M%S"```  
it displayed the formatted date and time in the command line. There must be no space between `+` and the format string. Next, I appended the formatted date to a file named formatted.txt. This was done using the double angle brackets (`>>`), like so:  
```date +"%Y%m%d_%H%M%S" >> formatted.txt```  
This created a new text file in my folder. If you open this file, you can see the date and time at which the command was run, formatted as specified.
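
A short sketch of how the format specifiers behave is given below. The variable names `stamp` and `literal` are assumptions introduced for illustration. The second command shows why the `%` signs matter: letters without them are simply echoed back.

```bash
# %Y = four-digit year, %m = month, %d = day of month,
# %H = hour (24-hour clock), %M = minute, %S = second.
stamp=$(date +"%Y%m%d_%H%M%S")
echo "$stamp"

# Without the % signs, date prints the letters literally instead of the time.
literal=$(date +"YYYYmmdd_HHMMSS")
echo "$literal"
```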


## Task 4 Create Timestamped File

In the codespace, I used the `touch` command to create a new file with a timestamped name. By enclosing the `date` command in backticks, I captured its output and used it as part of the file name. I used:  
```touch `date +"%Y%m%d_%H%M%S"`.txt ```   
This created an empty file whose name is a timestamp in the format YYYYmmdd_HHMMSS. The `touch` command creates a file with the given name if the file doesn't already exist. If the file already exists, it updates the file's timestamp to the current time instead.
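
The same idea can be sketched with `$(...)`, which is an equivalent form of command substitution that is easier to nest and quote than backticks. The scratch directory and the `filename` variable are assumptions added for this illustration.

```bash
# Work in a throwaway directory (assumption: added only for this illustration).
workdir=$(mktemp -d)
cd "$workdir"

# Build the file name from the output of date, then create the empty file.
filename="$(date +"%Y%m%d_%H%M%S").txt"
touch "$filename"

# The listing shows a file of size 0.
ls -l "$filename"

# Touching an existing file leaves its contents alone and only updates its timestamp.
touch "$filename"
```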


## Task 5 Download Todays Weather Data

I opened the Met Éireann website and found the weather data for today. I then used the `wget` command to download it:
 ```bash
wget https://prodapi.metweb.ie/observations/athenry/today
  ```

The `wget` command downloads files from the internet into the current directory. Next, I used `wget` with the `-O` option, which saves the output to a named file, in this case weather.json.

 ```bash
wget -O weather.json https://prodapi.metweb.ie/observations/athenry/today
```

 By opening the weather.json file, you can view the weather data for Athenry in JSON format.



 

## Task 6 Timestamp the Data

Using the `curl` command, I downloaded the data again and saved it to a file in my directory with a timestamped name in the format YYYYmmdd_HHMMSS.

 ``` curl -o `date +"%Y%m%d_%H%M%S.json"` https://prodapi.metweb.ie/observations/athenry/today```
This saved the Athenry data to a file named 20241116_215129.json, which records when the file was created: 16th November 2024 at 9:51:29 pm. If you open the file, you can see the downloaded weather data for Athenry today.


As described at [https://blog.hubspot.com/website/curl-command](https://blog.hubspot.com/website/curl-command), cURL is short for "client URL", and you can send or retrieve data using the `curl` command. It can be used instead of the `wget` command. If you enter the command with only the URL, the data is printed in the terminal. In our case, we used the `-o` option to save the data to a file with a timestamped name in the format YYYYmmdd_HHMMSS.




## Task 7 Write the Script

I created a file in my repository called weather.sh. A file with the extension .sh is a shell script, a program that can be run in a command line interface. It can be used to automate processes.
Such a script can also be set up to run tasks at a scheduled time. The shebang line `#!/bin/bash` tells the system to run the file with the Bash shell, and it must be the first line of the script.
I added the command from the previous step to my script, but changed the output path so that the file is saved in the data/weather/ directory:   ``` curl -o data/weather/`date +"%Y%m%d_%H%M%S.json"` https://prodapi.metweb.ie/observations/athenry/today```
You can also use `echo "string or $variable"` in a script. The `echo` command prints strings or the values of variables to the terminal.

I also needed to change the mode of the file, because when I first tried to run it, it showed "permission denied". The command `chmod` stands for "change mode", and `u+x` instructs the system to give the user (u) permission to execute (x) the file. I entered ```chmod u+x ./weather.sh``` in the command line. Then, when I ran the program in the command line with `./weather.sh`, the weather data was saved to a timestamped file in data/weather/.
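
The permission step can be sketched with a small stand-in script. This is an illustration only: the file name `demo.sh` is hypothetical, and an `echo` replaces the `curl` call so that the example runs without network access.

```bash
# Work in a throwaway directory (assumption: added only for this illustration).
workdir=$(mktemp -d)
cd "$workdir"

# Write a tiny stand-in script; the shebang is its first line.
cat > demo.sh <<'EOF'
#!/bin/bash
echo "script ran"
EOF

# A newly created file has no execute permission, so ./demo.sh would be refused.
if [ ! -x demo.sh ]; then
    echo "demo.sh is not executable yet"
fi

# Give the owning user (u) execute (x) permission, then run the script.
chmod u+x demo.sh
output=$(./demo.sh)
echo "$output"
```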


## Task 8 Notebook

I created this Jupyter notebook, called weather.ipynb. In it, I have written a report explaining how I completed Tasks 1 to 7. I explain the commands I used and their roles in completing each task.

## Task 9 Pandas

In this section, I import the data from one of the JSON files and examine and analyse it. First, I needed to check that Python was installed. Then I imported pandas so that I could use its read function. The data in the JSON file comes from the government website and contains a list of observations for every hour of the current day for the synoptic station in Athenry, Co. Galway. The file is updated hourly. Time values are local times. The values include: Name; Dry bulb temperature in whole degrees; Weather description; Windspeed (kt); Cardinal Wind Direction; Relative Humidity (%); Rainfall (mm); msl Pressure (mbar); Day of the week; Date; Time of observation.

In the sections below, I examine and summarise the data. 


In [1]:
!python --version

Python 3.12.1


In [1]:
import pandas as pd
# pandas provides data structures and data analysis tools.
# I use the function pd.read_json() to read in the JSON file so I can analyse it.

weather_athenry = pd.read_json('data/weather/20241116_215129.json')

# I can review the data using some functions in pandas.
# head(5) shows that there was some light rain and it was mainly cloudy for the first 5 hours on Saturday 2024-11-16.
weather_athenry.head(5)





| | name | temperature | symbol | weatherDescription | text | windSpeed | windGust | cardinalWindDirection | windDirection | humidity | rainfall | pressure | dayName | date | reportTime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Athenry | 11 | 46n | Light rain | "Recent Drizzle " | 9 | - | W | 270 | 99 | 0.2 | 1022 | Saturday | 2024-11-16 | 00:00 |
| 1 | Athenry | 11 | 04n | Cloudy | "Cloudy" | 9 | - | W | 270 | 98 | 0.0 | 1021 | Saturday | 2024-11-16 | 01:00 |
| 2 | Athenry | 11 | 04n | Cloudy | "Cloudy" | 7 | - | W | 270 | 97 | 0.0 | 1021 | Saturday | 2024-11-16 | 02:00 |
| 3 | Athenry | 11 | 04n | Cloudy | "Cloudy" | 9 | - | W | 270 | 99 | 0.0 | 1020 | Saturday | 2024-11-16 | 03:00 |
| 4 | Athenry | 11 | 04n | Cloudy | "Cloudy" | 7 | - | W | 270 | 99 | 0.0 | 1020 | Saturday | 2024-11-16 | 04:00 |


In [None]:
weather_athenry.info()
# info() gives a quick summary of the DataFrame, which helps us inspect the structure and contents of the data. There are 15 columns and 22 rows.
# The temperature, windSpeed, windDirection, humidity and pressure columns have an integer data type. The rainfall column has the data type float.
# The date column contains datetime data.


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 22 entries, 0 to 21
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   name                   22 non-null     object        
 1   temperature            22 non-null     int64         
 2   symbol                 22 non-null     object        
 3   weatherDescription     22 non-null     object        
 4   text                   22 non-null     object        
 5   windSpeed              22 non-null     int64         
 6   windGust               22 non-null     object        
 7   cardinalWindDirection  22 non-null     object        
 8   windDirection          22 non-null     int64         
 9   humidity               22 non-null     int64         
 10  rainfall               22 non-null     float64       
 11  pressure               22 non-null     int64         
 12  dayName                22 non-null     object        
 13  date   

[https://medium.com/@andrewdass/how-to-execute-sh-files-71d8885d8ef3#:~:text=A%20file%20with%20the%20%E2%80%9C.,files%20in%20Unix%20or%20Linux](https://medium.com/@andrewdass/how-to-execute-sh-files-71d8885d8ef3#:~:text=A%20file%20with%20the%20%E2%80%9C.,files%20in%20Unix%20or%20Linux.)