# Command-Line Operations and Data Handling

## Overview
This notebook documents a series of command-line tasks designed to practice directory structure creation, timestamp formatting, and data handling. The tasks involve working with timestamps, managing files, and downloading weather data using various shell commands.

Importing libraries:

In [1]:
import pandas as pd 

## Creating **weather.ipynb**
First, I opened Git and set up a workspace in a virtual machine. Using the command line, I created an empty file named `weather.ipynb` with the following command:
``` bash
touch weather.ipynb
```

## Task 1: Creating Directory Structure
I opened a virtual machine and, using the command line in the terminal, created a new directory with the following command:
``` bash
mkdir data
```
Next, I needed to create two subdirectories, timestamps and weather. To do this, I navigated to the data directory:
``` bash
cd data
```
Then, I created both subdirectories with a single `mkdir` command:
``` bash
mkdir timestamps weather
```

I can also create a directory and its subdirectories in one step. First, I return to the parent directory and remove the `data` directory with `rm -r`, which deletes the directory and everything in it, including subdirectories:
```bash
cd ..
rm -r data
```
Now, I use `mkdir -p` to create the `data` directory together with its subdirectories (`timestamps` and `weather`). The `-p` option creates any missing parent directories and does not report an error if a directory already exists.
```bash
mkdir -p data/timestamps data/weather
```
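To see what `-p` changes, the following sketch can be run in a scratch directory. It assumes `mktemp` is available to create that directory; the directory names match the ones used above.

```bash
# Work in a throwaway directory so the real repository is not touched.
cd "$(mktemp -d)"

# Without -p, creating a nested path fails because 'data' does not exist yet.
if ! mkdir data/timestamps 2>/dev/null; then
    echo "plain mkdir failed: parent directory is missing"
fi

# With -p, the parent and both subdirectories are created in one command.
mkdir -p data/timestamps data/weather

# Running it again is harmless: -p does not treat existing directories as errors.
mkdir -p data/timestamps data/weather

# List the resulting directory tree.
find data -type d | sort
```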

## Task 2: Timestamps
First, I change to the `data/timestamps` directory:
```bash
cd data/timestamps
```
I use the `date` command with the `>>` operator to append the current date and time to a file named `now.txt`. If `now.txt` does not exist, it is created automatically:
```bash
date >> now.txt
```
The `>>` operator appends the output to the file.
The `>` operator would overwrite the file, but `>>` ensures that multiple entries are added instead.
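
The difference between the two operators is easy to check with a small experiment. This is only an illustration: `demo.txt` is a hypothetical file name, and the commands run in a scratch directory created with `mktemp`.

```bash
# Work in a throwaway directory.
cd "$(mktemp -d)"

echo "first"  > demo.txt     # creates the file with one line
echo "second" >> demo.txt    # appends, so the file now has two lines
wc -l < demo.txt

echo "third"  > demo.txt     # overwrites, so only this line remains
cat demo.txt
```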

I repeat the `date >> now.txt` command 10 times to append 10 different timestamps to the `now.txt` file.

After appending the timestamps, I use the `more` command to display the contents of `now.txt` and confirm that the file has the expected timestamps:
```bash
more now.txt
```
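
Instead of typing the command ten times, the repetition could be written as a loop. This is a sketch rather than what I actually ran; it assumes the `seq` utility is installed and works in a scratch directory.

```bash
# Work in a throwaway directory.
cd "$(mktemp -d)"

# Append the current date and time ten times.
for i in $(seq 1 10); do
    date >> now.txt
done

# Count the lines to confirm there are ten entries.
wc -l < now.txt
```

Because the loop finishes almost instantly, the ten entries will probably show the same second. Adding `sleep 1` inside the loop would give ten distinct timestamps.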

## Git 
To commit and push the changes, I type the following in the command line:
``` bash
git add .
git commit -m "<comment>"
git push
```

## Task 3: Formatting Timestamps
To learn more about the date command and its options, I opened the manual using:
```bash
man date
```
After exploring the manual, I tried different date and time formats. To generate a timestamp in the format **20241023_092605**, I used the following command:
```bash
date +%Y%m%d_%H%M%S
```
To save this formatted timestamp by appending it to `formatted.txt` in the `data/timestamps` directory, I ran the following from the directory that contains `data`:
```bash
date +%Y%m%d_%H%M%S >> data/timestamps/formatted.txt
```
This command appends the formatted timestamp to the file without overwriting its contents.
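
Each `%` code in the format string stands for one part of the date. The sketch below prints the parts separately and then checks that the combined timestamp has the expected shape of eight digits, an underscore, and six digits. The variable name `stamp` is my own choice for illustration.

```bash
# Show what each format code produces on its own.
date +"year=%Y month=%m day=%d hour=%H minute=%M second=%S"

# Build the combined timestamp and keep it in a variable.
stamp=$(date +%Y%m%d_%H%M%S)
echo "$stamp"

# Verify the shape: 8 digits, an underscore, then 6 digits.
if echo "$stamp" | grep -Eq '^[0-9]{8}_[0-9]{6}$'; then
    echo "format OK"
fi
```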

## Task 4: Create Timestamped Files

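I used the following command to create an empty file whose name is the current timestamp, in the format `YYYYmmdd_HHMMSS.txt`. The backticks run `date` first and substitute its output into the file name: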
```bash
touch `date +%Y%m%d_%H%M%S`.txt
```
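
Backticks are one form of command substitution; the `$(...)` form does the same thing and is easier to read and to nest. A possible equivalent, run in a scratch directory, with `stamp` as an illustrative variable name:

```bash
# Work in a throwaway directory.
cd "$(mktemp -d)"

# Capture the timestamp once, then reuse it for the file name.
stamp=$(date +%Y%m%d_%H%M%S)
touch "${stamp}.txt"

# Confirm that the timestamped file exists.
ls
```

Storing the timestamp in a variable also guarantees that the same value is used if the name is needed more than once.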

## Task 5: Download Today's Weather Data
In this task, I downloaded the latest weather data for the Athenry weather station from Met Éireann using the **wget** command and saved it as `weather.json`.

First, I changed to the appropriate directory where I wanted to save the file:
```bash
cd data/weather
```
To download the weather data from **Met Éireann**, I used the `wget` command with the `-O` option to specify the output file name:
```bash
wget -O weather.json https://prodapi.metweb.ie/observations/athenry/today
```
`-O weather.json`: Ensures the downloaded file is saved with the name `weather.json`.<br>
`URL`: The link to the Athenry weather data is https://prodapi.metweb.ie/observations/athenry/today.

After the download completed, I confirmed that the file had been saved by listing the files in the `data/weather` directory:
```bash
ls
```

#### Summary
The **wget** command is a tool in Unix-like systems used for downloading files from the web directly via the terminal.
**To download a file:**
Use the following command to download a file from a specified URL:
```bash
wget <URL>
```
**To save the downloaded file with a different name:**
You can use the `-O` option to specify the desired file name:
```bash
wget -O new_filename <URL>
```
**To download multiple files listed in a text file:**
Use the `-i` option followed by the name of the text file containing a list of URLs:
```bash
wget -i files.txt
```


## Task 6: Timestamp the Data
Modify the command from Task 5 to save the downloaded file with a timestamped name in the format `YYYYmmdd_HHMMSS.json`.

To achieve this, I used backticks to set the timestamp as the filename:
```bash
wget -O `date +"%Y%m%d_%H%M%S.json"` <URL>
```
This command uses the date command to generate the current timestamp in the desired format and passes it to `wget` as the filename.

## Task 7: Write the Script
'Write a bash script called **weather.sh** in the root of your repository. This script should automate the process from Task 6, saving the weather data to the `data/weather` directory. Make the script executable and test it by running it.'

1. In the root directory, I created a new file named `weather.sh`.
2. I opened the `weather.sh` file and started the script with a shebang (#!), which specifies the interpreter to use for the script. In this case, I used `/bin/bash`, indicating that it’s a Bash script:

```bash
#!/bin/bash
```
3. Next, I added the `wget` command to fetch the weather data from the specified URL. The data is saved with a timestamped filename:
```bash
wget -O data/weather/`date +"%Y%m%d_%H%M%S.json"` https://prodapi.metweb.ie/observations/athenry/today
```
4. After saving `weather.sh`, I checked the files in the directory using the command:
```bash
ls -al
```
This command lists all files, including hidden ones, with their permissions and ownership.<br>The permission codes read as follows:<br>
`r` (read), `w` (write), and `x` (execute) indicate permission levels for the user, group, and others.<br>
`d` at the beginning indicates a directory, while a dash `-` indicates a regular file.
5. To allow `weather.sh` to be executed, I modified its permissions using `chmod`. The `u+x` option grants execute permission to the user (owner):
```bash
chmod u+x weather.sh
```
This makes `weather.sh` an executable script.
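
The effect of `chmod u+x` can be seen by listing a file before and after the change. The sketch uses a hypothetical script called `demo.sh` in a scratch directory; the exact permission string shown by `ls -l` depends on the system's default settings (umask).

```bash
# Work in a throwaway directory.
cd "$(mktemp -d)"

# Create a tiny script; new files normally have no execute permission.
printf '#!/bin/bash\necho "hello"\n' > demo.sh
ls -l demo.sh

# Grant execute permission to the owner and list the file again.
chmod u+x demo.sh
ls -l demo.sh

# The -x test succeeds once the current user may execute the file.
if [ -x demo.sh ]; then
    echo "demo.sh is executable"
fi
```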

Now that the script is executable, I can run it using the following command:
```bash
./weather.sh
```
The `./` in a command refers to the current directory. When `./` is placed before a script or program name, it tells the system to look for the executable in the current directory rather than searching through the directories listed in the system's `$PATH`.
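
A slightly more defensive version of the script is sketched below. It is not the script I submitted: the `set -eu` line, the `cd` to the script's own directory, and the `outfile` variable are additions for illustration. The sketch writes the script to a hypothetical file `weather_sketch.sh` in a scratch directory and only checks its syntax with `bash -n`, so nothing is downloaded.

```bash
# Work in a throwaway directory.
cd "$(mktemp -d)"

# Write the script to a file; the quoted 'EOF' keeps $(...) from being expanded now.
cat > weather_sketch.sh <<'EOF'
#!/bin/bash
set -eu                          # stop on errors and on unset variables
cd "$(dirname "$0")"             # work relative to the script's location
mkdir -p data/weather            # make sure the target directory exists
outfile="data/weather/$(date +%Y%m%d_%H%M%S).json"
wget -O "$outfile" https://prodapi.metweb.ie/observations/athenry/today
echo "Saved $outfile"
EOF

# Make the script executable for its owner.
chmod u+x weather_sketch.sh

# Parse the script without running it.
bash -n weather_sketch.sh && echo "syntax OK"
```

Changing to the script's own directory means the relative path `data/weather` resolves the same way no matter where the script is called from.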


## Task 8: Notebook
The notebook, with a description of each task, has been created.

## Task 9: pandas
'In the `weather.ipynb` notebook, use the **pandas** function `read_json()` to load in any one of the weather data files you have downloaded with your script. Examine and summarize the data. Use the information provided [data.gov.ie](https://data.gov.ie/dataset/todays-weather-athenry) to write a short explanation of what the data set contains.'

I use the **read_json()** function from pandas to load one of the weather data files.

In [2]:
# Saving the data into a dataframe
data = pd.read_json('data/weather/20241108_162627.json')

# Checking the data
display(data.sample(3))

# Checking the columns
data.columns

| | name | temperature | symbol | weatherDescription | text | windSpeed | windGust | cardinalWindDirection | windDirection | humidity | rainfall | pressure | dayName | date | reportTime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Athenry | 13 | 04n | Cloudy | "Cloudy" | 11 | - | SE | 135 | 86 | 0 | 1022 | Friday | 2024-08-11 | 00:00 |
| 1 | Athenry | 13 | 04n | Cloudy | "Cloudy" | 13 | - | SE | 135 | 87 | 0 | 1021 | Friday | 2024-08-11 | 01:00 |
| 13 | Athenry | 14 | 04d | Cloudy | "Cloudy" | 9 | - | SE | 135 | 81 | 0 | 1020 | Friday | 2024-08-11 | 13:00 |


Index(['name', 'temperature', 'symbol', 'weatherDescription', 'text',
       'windSpeed', 'windGust', 'cardinalWindDirection', 'windDirection',
       'humidity', 'rainfall', 'pressure', 'dayName', 'date', 'reportTime'],
      dtype='object')


### The dataset contains the following information in its columns:
**name**:<br>
The name of the weather station or location where the data was recorded.<br>
**temperature**:<br>
This column holds the current temperature measured in Celsius (°C).<br>
**symbol**:<br>
This column contains a code for the icon that represents the current weather condition (e.g., `04d` or `04n` for cloudy). In the sample above, codes ending in `n` appear at night-time hours and `d` during the day.<br>
**weatherDescription**:<br>
A textual description of the weather condition at the time of the observation. For example, "Cloudy".<br>
**text**:<br>
The same description stored as a quoted string (e.g., `"Cloudy"`).<br>
**windSpeed**:<br>
This column represents the speed of the wind measured in kilometers per hour (km/h).<br>
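**windGust**:<br>
This column contains the wind gust speed, representing short bursts or increases in wind speed measured in kilometers per hour (km/h). In this file every value is `-`, meaning no gust was reported, so pandas loads the column as text (`object`).<br>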
**cardinalWindDirection**:<br>
This column indicates the general wind direction as a compass abbreviation (e.g., `SE` for southeast, `NW` for northwest).<br>
**windDirection**:<br>
This column provides a more precise wind direction, often expressed in degrees, where 0° corresponds to North, 90° to East, 180° to South, and 270° to West.<br>
**humidity**:<br>
The amount of moisture in the air at the time of the observation, given as a percentage (0-100%), where 100% represents fully saturated air (as often occurs in fog or rain).<br>
**rainfall**:<br>
This column shows the amount of rain that has fallen over a certain period of time (e.g., in millimeters).<br>
**pressure**:<br>
Atmospheric pressure recorded at the weather station, typically given in hectopascals (hPa).<br>
**dayName**:<br>
The name of the day of the week (e.g., Monday, Tuesday, etc.) on which the observation was taken.<br>
**date**:<br>
The specific date when the weather observation was recorded, often formatted as YYYY-MM-DD.<br>
**reportTime**:<br>
The time at which the weather observation was reported, in HH:MM format (e.g., 13:00). This shows the time of day when the data was logged.<br>

In [3]:
# The summary statistics for the numerical columns in the dataset.
data.describe()

| | temperature | windSpeed | windDirection | humidity | rainfall | pressure | date |
|---|---|---|---|---|---|---|---|
| count | 17.0 | 17.0 | 17.0 | 17.0 | 17.0 | 17.0 | 17 |
| mean | 13.0 | 11.117647 | 129.705882 | 84.529412 | 0.0 | 1020.764706 | 2024-08-11 00:00:00 |
| min | 12.0 | 7.0 | 90.0 | 80.0 | 0.0 | 1020.0 | 2024-08-11 00:00:00 |
| 25% | 13.0 | 9.0 | 135.0 | 83.0 | 0.0 | 1020.0 | 2024-08-11 00:00:00 |
| 50% | 13.0 | 11.0 | 135.0 | 84.0 | 0.0 | 1021.0 | 2024-08-11 00:00:00 |
| 75% | 13.0 | 13.0 | 135.0 | 86.0 | 0.0 | 1021.0 | 2024-08-11 00:00:00 |
| max | 14.0 | 15.0 | 135.0 | 89.0 | 0.0 | 1022.0 | 2024-08-11 00:00:00 |
| std | 0.612372 | 2.057983 | 14.944751 | 2.45249 | 0.0 | 0.562296 | |


In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17 entries, 0 to 16
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype         
---  ------                 --------------  -----         
 0   name                   17 non-null     object        
 1   temperature            17 non-null     int64         
 2   symbol                 17 non-null     object        
 3   weatherDescription     17 non-null     object        
 4   text                   17 non-null     object        
 5   windSpeed              17 non-null     int64         
 6   windGust               17 non-null     object        
 7   cardinalWindDirection  17 non-null     object        
 8   windDirection          17 non-null     int64         
 9   humidity               17 non-null     int64         
 10  rainfall               17 non-null     int64         
 11  pressure               17 non-null     int64         
 12  dayName                17 non-null     object        
 13  date   

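#### The dataset records weather observations for Athenry on 8 November 2024, including the following:

Note: pandas shows the date as 2024-08-11, but the file was downloaded on 8 November 2024 (file name `20241108_162627.json`) and `dayName` is Friday, which matches 8 November rather than 11 August. The day-first date was most likely parsed as month-first.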

Temperature: Mean 13°C (range: 12–14°C), showing little variation.<br>
Wind Speed: Average 11.1 km/h (range: 7–15 km/h) with moderate fluctuations.<br>
Wind Direction: Primarily Southeast (mean 129.7°).<br>
Humidity: High, averaging 84.5% (range: 80–89%).<br>
Rainfall: No rainfall recorded.<br>
Pressure: Stable, averaging 1020.8 hPa (range: 1020–1022 hPa).<br>
The day was consistently cloudy with no significant weather changes.<br>

## End.