# Workshop 18 Revised Part 3
## Using `git` on JupyterHub
1. Visit https://github.com/settings/tokens
2. Generate a personal access token (PAT) and set it to expire at the end of this year.
3. Add changes and commit as usual.
4. Run `git push`. When prompted, enter your username, then enter your generated PAT instead of your normal password.
5. Your changes should now be pushed.

### First-Time Errors
If you get the following error:
```
*** Please tell me who you are.

Run

  git config --global user.email "you@example.com"
  git config --global user.name "Your Name"

to set your account's default identity.
Omit --global to set the identity only in this repository.

fatal: unable to auto-detect email address
```

You will need to run the two `git config` commands shown in the message and **make sure you have the correct details**.
- `user.email`: your GitHub email
- `user.name`: your GitHub username


### Cloning:
1. Open a terminal (these steps use command-line git).
2. `git clone URL` (where `URL` is the HTTPS URL of your GitHub repo).
3. Enter your credentials (if required).
4. Done.

### Pushing:
1. Change directories to inside your repository (`cd NAME_OF_REPO_FOLDER`).
2. `git add -A` (this stages all changed and untracked files for the next commit, except ignored files). You can run `git status` before adding to see which files have changed.
3. `git commit -m "add some message here"` (make a commit with a message).
4. `git push`
5. Enter your credentials when prompted.
6. For the username, use your usual GitHub username.
7. **BUT, instead of your password, use the PAT you generated.**
8. Done.

### Pulling:
1. Change directories to inside your repository (`cd NAME_OF_REPO_FOLDER`).
2. `git pull`
3. Done.

## Using `git` Locally
1. Download and install GitHub Desktop: https://desktop.github.com/
2. Log in with your details.
3. Go to the left side-bar and click on `Add`.
4. If a group repository has already been created, use `Clone Repository` $\rightarrow$ `URL` and copy/paste the repository URL there. If you are the creator of the group repository, use `Create Repository`, fill in the details, and **make sure you check "Initialize this repository with a README"**.

### Pushing:
1. Go to the left side-bar inside GitHub Desktop and you will see the `Changes` tab.
2. Add a useful commit message (e.g. "added correlation analysis").
3. Click on the blue `Commit to main` button.
4. Click `Push origin` to push.
5. Done.

### Pulling:
1. Go to the repository inside GitHub Desktop.
2. You should see a download arrow indicating there are changes to pull.
3. Click on it.
4. Done.

# Project 2 Starters

In [1]:
import pandas as pd

df1 = pd.read_csv('price_demand_data.csv')
df2 = pd.read_csv('weather_data.csv')

In [2]:
df1.head()

Unnamed: 0,REGION,SETTLEMENTDATE,TOTALDEMAND,PRICECATEGORY
0,VIC1,1/01/2021 0:30,4179.21,LOW
1,VIC1,1/01/2021 1:00,4047.76,LOW
2,VIC1,1/01/2021 1:30,3934.7,LOW
3,VIC1,1/01/2021 2:00,3766.45,LOW
4,VIC1,1/01/2021 2:30,3590.37,LOW


In [3]:
df2.head()

Unnamed: 0,Date,Minimum temperature (°C),Maximum temperature (°C),Rainfall (mm),Evaporation (mm),Sunshine (hours),Direction of maximum wind gust,Speed of maximum wind gust (km/h),Time of maximum wind gust,9am Temperature (°C),...,9am cloud amount (oktas),9am wind direction,9am wind speed (km/h),9am MSL pressure (hPa),3pm Temperature (°C),3pm relative humidity (%),3pm cloud amount (oktas),3pm wind direction,3pm wind speed (km/h),3pm MSL pressure (hPa)
0,1/01/2021,15.6,29.9,0.0,2.8,9.3,NNE,31.0,13:14,19.2,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
1,2/01/2021,18.4,29.0,0.0,9.4,1.3,NNW,30.0,8:22,23.3,...,7,NNW,17,1013.3,28.7,38,7.0,SW,4,1008.5
2,3/01/2021,17.0,26.2,12.6,4.8,7.1,WSW,33.0,17:55,18.3,...,8,WSW,4,1007.7,23.5,59,4.0,SSW,2,1005.2
3,4/01/2021,16.0,18.6,2.6,3.8,0.0,SSE,41.0,16:03,16.2,...,8,SSE,11,1010.0,18.2,82,8.0,SSW,17,1011.0
4,5/01/2021,15.9,19.1,11.2,1.0,0.0,SSE,35.0,11:02,17.2,...,8,SSE,13,1012.5,18.2,82,8.0,SSE,19,1013.3


In [4]:
# get the dd/mm/yyyy field from SETTLEMENTDATE so we can join it with the weather
df1['Date'] = df1['SETTLEMENTDATE'].apply(lambda x: x.split()[0])
df1.head()

Unnamed: 0,REGION,SETTLEMENTDATE,TOTALDEMAND,PRICECATEGORY,Date
0,VIC1,1/01/2021 0:30,4179.21,LOW,1/01/2021
1,VIC1,1/01/2021 1:00,4047.76,LOW,1/01/2021
2,VIC1,1/01/2021 1:30,3934.7,LOW,1/01/2021
3,VIC1,1/01/2021 2:00,3766.45,LOW,1/01/2021
4,VIC1,1/01/2021 2:30,3590.37,LOW,1/01/2021
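
Splitting the string works as long as both files write dates in exactly the same text form. As an alternative, here is a minimal sketch that parses `SETTLEMENTDATE` into real timestamps, which also makes sorting and grouping by date behave chronologically. It assumes the `day/month/year hour:minute` layout shown in the output above; the `demand` frame is a toy copy of three rows from the tables in this notebook, and the `Timestamp` and `Day` column names are illustrative choices, not part of the dataset.

```python
import pandas as pd

# Toy frame using three rows copied from the outputs in this notebook
demand = pd.DataFrame({
    "SETTLEMENTDATE": ["1/01/2021 0:30", "1/01/2021 1:00", "31/08/2021 21:30"],
    "TOTALDEMAND": [4179.21, 4047.76, 5075.93],
})

# Parse with an explicit day-first format so 1/01/2021 is read as 1 January
demand["Timestamp"] = pd.to_datetime(
    demand["SETTLEMENTDATE"], format="%d/%m/%Y %H:%M"
)

# Drop the time-of-day part to get one value per calendar day
# ("Day" is a hypothetical column name used only in this sketch)
demand["Day"] = demand["Timestamp"].dt.normalize()

print(demand[["SETTLEMENTDATE", "Day"]])
```

If you join on a parsed column like this, the weather `Date` column would need to be parsed the same way (for example with `format="%d/%m/%Y"`) so both keys have the same type.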


Now, we can "merge" the two datasets based on the date.
- `df1` (price): we use the newly created `Date` field
- `df2` (weather): we use the existing `Date` field

The merge method is used as `left_table.merge(right_table, arguments...)`.

In [5]:
data = df1.merge(df2, left_on='Date', right_on='Date')
data

Unnamed: 0,REGION,SETTLEMENTDATE,TOTALDEMAND,PRICECATEGORY,Date,Minimum temperature (°C),Maximum temperature (°C),Rainfall (mm),Evaporation (mm),Sunshine (hours),...,9am cloud amount (oktas),9am wind direction,9am wind speed (km/h),9am MSL pressure (hPa),3pm Temperature (°C),3pm relative humidity (%),3pm cloud amount (oktas),3pm wind direction,3pm wind speed (km/h),3pm MSL pressure (hPa)
0,VIC1,1/01/2021 0:30,4179.21,LOW,1/01/2021,15.6,29.9,0.0,2.8,9.3,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
1,VIC1,1/01/2021 1:00,4047.76,LOW,1/01/2021,15.6,29.9,0.0,2.8,9.3,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
2,VIC1,1/01/2021 1:30,3934.70,LOW,1/01/2021,15.6,29.9,0.0,2.8,9.3,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
3,VIC1,1/01/2021 2:00,3766.45,LOW,1/01/2021,15.6,29.9,0.0,2.8,9.3,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
4,VIC1,1/01/2021 2:30,3590.37,LOW,1/01/2021,15.6,29.9,0.0,2.8,9.3,...,6,N,2,1018.8,28.1,43,5.0,E,13,1015.3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11658,VIC1,31/08/2021 21:30,5075.93,MEDIUM,31/08/2021,11.0,20.1,0.0,5.8,3.6,...,7,N,17,,19.4,43,6.0,N,30,1012.2
11659,VIC1,31/08/2021 22:00,4861.91,MEDIUM,31/08/2021,11.0,20.1,0.0,5.8,3.6,...,7,N,17,,19.4,43,6.0,N,30,1012.2
11660,VIC1,31/08/2021 22:30,4748.74,MEDIUM,31/08/2021,11.0,20.1,0.0,5.8,3.6,...,7,N,17,,19.4,43,6.0,N,30,1012.2
11661,VIC1,31/08/2021 23:00,4620.09,MEDIUM,31/08/2021,11.0,20.1,0.0,5.8,3.6,...,7,N,17,,19.4,43,6.0,N,30,1012.2
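
By default, `merge` performs an inner join, so any price row whose `Date` has no match in the weather table is silently dropped. The sketch below shows one possible way to check for that using the `how`, `validate`, and `indicator` arguments of `DataFrame.merge`. The `price` and `weather` frames are toy data built from values shown above, and the toy weather table deliberately leaves out 31/08/2021 to make the effect visible; the real files may match completely.

```python
import pandas as pd

# Toy price table: many half-hourly rows per date
price = pd.DataFrame({
    "SETTLEMENTDATE": ["1/01/2021 0:30", "1/01/2021 1:00", "31/08/2021 21:30"],
    "Date": ["1/01/2021", "1/01/2021", "31/08/2021"],
})

# Toy weather table: one row per date, with 31/08/2021 left out on purpose
weather = pd.DataFrame({
    "Date": ["1/01/2021"],
    "Maximum temperature (°C)": [29.9],
})

# Default (inner) merge keeps only the dates present in both tables
inner = price.merge(weather, on="Date")

# A left merge keeps every price row; validate raises an error if a date
# appears more than once in the weather table; indicator adds a `_merge`
# column recording where each row came from
merged = price.merge(
    weather, on="Date", how="left", validate="many_to_one", indicator=True
)

# Price rows that found no weather record
unmatched = merged[merged["_merge"] == "left_only"]

print(len(inner), len(merged), len(unmatched))
```

Comparing the row counts before and after the merge on the real `df1` and `df2` is a quick way to confirm that nothing was lost or duplicated.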
