Maggie Jacoby, 2020

***
# General notes about Python
***

### Adding to python path
```$ nano /etc/paths```

### string and date formatting

**strftime**: datetime object -> string

**strptime**: string -> datetime object

```python
from datetime import datetime
date = datetime.now()
date_as_string = date.strftime('%Y-%m-%d %H%M') # '%Y-%m-%d %H%M' is the format you WANT it to be in
date_as_dt_object = datetime.strptime(date_as_string, '%Y-%m-%d %H%M') #'%Y-%m-%d %H%M' is the format it IS in
```


### Sorting

- ```sorted(x)``` is a built-in function that sorts any iterable (x) in ascending or descending order and always returns a new sorted list. It works with different datatypes and can sort on different metrics (via ```key```).

- ```x.sort()``` sorts x in place and returns ```None```. It is a method of the list class, so it can only be used with lists.

ref: https://www.geeksforgeeks.org/python-difference-between-sorted-and-sort/
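
A minimal sketch of the difference, using made-up data (the `words` and `nums` names are illustrative only):

```python
# sorted() accepts any iterable and returns a new list; the input is untouched.
words = ('pear', 'fig', 'banana')          # a tuple, not a list
by_length = sorted(words, key=len)         # sort on a different metric
descending = sorted(words, reverse=True)   # alphabetical, descending

# list.sort() sorts in place and returns None.
nums = [3, 1, 2]
result = nums.sort()

print(by_length)     # ['fig', 'pear', 'banana']
print(descending)    # ['pear', 'fig', 'banana']
print(nums, result)  # [1, 2, 3] None
```

Assigning the result of `x.sort()` to a variable is a common slip, since it always gives `None`.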

### Formatting Decimals

```python
total = total_captured/self.total_per_day
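T_perc = f'{total:.2f}'  # f prefix goes outside the quotes; '.2f' keeps 2 digits after the decimal point

# general pattern: width and precision can themselves be variables
f'{value:{width}.{precision}}'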
```
ref: https://stackoverflow.com/questions/45310254/fixed-digits-after-decimal-with-f-strings
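
A small sketch of how the format spec behaves, assuming a made-up fraction (`ratio`, `width`, and `precision` are hypothetical values, not from the original code):

```python
ratio = 0.012345         # hypothetical fraction to format
width, precision = 8, 3  # hypothetical field width and precision

two_decimals = f'{ratio:.2f}'             # fixed-point: 2 digits after the decimal point
two_sig_figs = f'{ratio:.2}'              # no 'f': 2 significant digits
as_percent = f'{ratio:.1%}'               # multiplies by 100 and appends '%'
padded = f'{ratio:{width}.{precision}f}'  # nested fields supply width and precision

print(two_decimals)  # 0.01
print(two_sig_figs)  # 0.012
print(as_percent)    # 1.2%
print(repr(padded))  # '   0.012'
```

The trailing `f` in the spec is what switches from significant digits to a fixed number of decimals.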



### Writing Text Files

```python
fname = f'/Users/maggie/Desktop/percent_in_3sigma_{col}.txt'
with open(fname, 'w+') as writer:
    # ...
    writer.write(f'total, {np.mean(total)}\n')
# no writer.close() needed: the with block closes the file on exit
```




### Read in from a CSV

```python
import csv

with open('/Users/maggie/Documents/Maggie-Grad-School/ARPA-e-ResearchProject/FFA-DoE/iris_full_runs.csv') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
```

### Merging two dictionaries


```python
L1 = ['a', 'b', 'c', 'd']
L2 = ['x', 'y', 'z']

d1 = {x:i for i,x in enumerate(L1)}
d2 = {x:i for i,x in enumerate(L2, len(L1))} # enumerates starting at the length of L1
```

#### update method (updates d2 in place)

```d2.update(d1)```

#### \*\* method (returns a third dictionary, does not modify d1, d2)

```d3 = {**d1, **d2}```
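
In the example above the keys of d1 and d2 don't overlap. A small sketch with hypothetical overlapping keys shows which value wins in each approach:

```python
d1 = {'a': 0, 'b': 1}
d2 = {'b': 99, 'c': 2}   # 'b' collides with d1

d3 = {**d1, **d2}        # later mapping wins: 'b' comes from d2
d4 = {**d2, **d1}        # reversed order: 'b' comes from d1

merged = dict(d2)        # copy, so d2 itself stays unchanged
merged.update(d1)        # update() overwrites with d1's values

print(d3)      # {'a': 0, 'b': 99, 'c': 2}
print(d4)      # {'b': 1, 'c': 2, 'a': 0}
print(merged)  # {'b': 1, 'c': 2, 'a': 0}
```

So `d2.update(d1)` gives the same result as `{**d2, **d1}`, not `{**d1, **d2}`.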

### Rename a bunch of files

```python
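import os

# mylistdir is the helper defined under "Some of my favorite little functions"
path_dir = '/Volumes/TOSHIBA-12/H5-red'

home = path_dir.strip('/').split('/')[-1].split('-')[0]  # 'H5-red' -> 'H5'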

for hub in mylistdir(path_dir, bit='RS', end=False):
    for audio_type in mylistdir(os.path.join(path_dir, hub, 'processed_audio'), bit='audio_', end=False):
        for day in mylistdir(os.path.join(path_dir, hub, 'processed_audio', audio_type), bit='2019-', end=False):
            file_path = os.path.join(path_dir, hub, 'processed_audio', audio_type, day)
            for fname in mylistdir(file_path, bit='.npz', end=True):
                f_hr, f_ext = fname.split('_')
                new = f'{day}_{f_hr}_{hub}_{home}_{f_ext}'
                old_fname = os.path.join(file_path, fname)
                new_fname = os.path.join(file_path, new)
                os.rename(old_fname, new_fname)
```

***

### Some of my favorite little functions

```python
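import os


def make_storage_directory(target_dir):
    # create the directory (and any missing parents) if it doesn't exist yet
    if not os.path.exists(target_dir):
        os.makedirs(target_dir)
    return target_dir


def mylistdir(directory, bit='', end=True):
    # list entries whose names end with bit (end=True) or start with bit (end=False)
    filelist = os.listdir(directory)
    if end:
        return [x for x in filelist if x.endswith(f'{bit}')]
    else:
        return [x for x in filelist if x.startswith(f'{bit}')]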
```
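
A quick usage sketch for `mylistdir`, run against a temporary directory with made-up file names (the helper is repeated here only so the block runs on its own):

```python
import os
import tempfile


def mylistdir(directory, bit='', end=True):
    # same logic as the helper above
    filelist = os.listdir(directory)
    if end:
        return [x for x in filelist if x.endswith(bit)]
    return [x for x in filelist if x.startswith(bit)]


with tempfile.TemporaryDirectory() as tmp:
    # hypothetical file names, created empty
    for name in ('RS1', 'RS2', 'notes.txt', '.DS_Store'):
        open(os.path.join(tmp, name), 'w').close()
    hubs = sorted(mylistdir(tmp, bit='RS', end=False))   # names starting with 'RS'
    texts = mylistdir(tmp, bit='.txt', end=True)         # names ending with '.txt'

print(hubs)   # ['RS1', 'RS2']
print(texts)  # ['notes.txt']
```

Filtering on a prefix or suffix is a handy way to skip hidden files such as `.DS_Store`. `os.listdir` returns entries in arbitrary order, hence the `sorted` call.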

### Ternary Operator
`[on_true] if [expression] else [on_false]`
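
A minimal example of the pattern (`label_reading` and its `limit` default are hypothetical):

```python
def label_reading(temp_c, limit=30):
    # [on_true] if [expression] else [on_false]
    return 'hot' if temp_c > limit else 'ok'


labels = [label_reading(t) for t in (12, 31, 30)]
print(labels)  # ['ok', 'hot', 'ok']
```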




***
***
# Pandas
***

### Change value with .loc


```python
df.loc[df.rh_percent > limit, 'rh_percent'] = np.nan  # np.NaN alias was removed in NumPy 2.0
```

Replaces all values in ```df['rh_percent']``` that are greater than ```limit``` with ```np.nan```.
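
A self-contained sketch on toy data (the column name matches the snippet above; the values and `limit` are made up):

```python
import numpy as np
import pandas as pd

# toy humidity readings, two of them physically impossible
df = pd.DataFrame({'rh_percent': [45.0, 101.5, 60.2, 250.0]})
limit = 100

# the boolean mask selects the rows, the column label selects where to write
df.loc[df.rh_percent > limit, 'rh_percent'] = np.nan

print(df['rh_percent'].isna().tolist())  # [False, True, False, True]
```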

### Drop or replace values

Drop the last 24 rows of data from the df or series ```Baseline```:
```python
Baseline = Baseline.drop(Baseline.tail(24).index)
```

Drop particular columns:
```python
df = df.drop(columns = ['str_datetime', 'time'])
```
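
Both kinds of drop can be tried on a toy frame. This is only a sketch; the `temp_c` column and the values are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'time': range(5), 'temp_c': [20, 21, 22, 23, 24]})

trimmed = df.drop(df.tail(2).index)   # drop the last 2 rows by index label
no_time = df.drop(columns=['time'])   # drop a column by name
also_trimmed = df.iloc[:-2]           # positional slicing keeps the same rows

print(len(trimmed))                  # 3
print(list(no_time.columns))         # ['temp_c']
print(trimmed.equals(also_trimmed))  # True
```

Note that `drop` returns a new object, so `df` itself still has all five rows unless the result is assigned back.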

### Code for creating a toy df 

look here too: https://realpython.com/python-pandas-tricks/#2-make-toy-data-structures-with-pandas-testing-module

```python
import pandas as pd
import numpy as np

ind = pd.Index([pd.Timestamp('2019-03-17'), 
                pd.Timestamp('2019-03-18'), 
                pd.Timestamp('2019-03-20'),
                pd.Timestamp('2019-03-21'),
                pd.Timestamp('2019-03-22'),
                pd.Timestamp('2019-03-24'),
                pd.Timestamp('2019-03-25')])
data = {'col':[25,25,24,3,25,24, np.nan]}
df = pd.DataFrame(data, ind)
```

### Make a pandas range of dates

```python
def make_date_range(day1, dayn=None, t1='0000', tn='2359', f='10s'):
    # day1/dayn are 'YYYY-MM-DD' strings, t1/tn are 'HHMM' strings
    dayn = day1 if dayn is None else dayn  # end on the start day unless dayn is given
    range_start = f'{day1} {t1[0:2]}:{t1[2:4]}:00'
    range_end = f'{dayn} {tn[0:2]}:{tn[2:4]}:50'
    date_range = pd.date_range(start=range_start, end=range_end, freq=f)
    return date_range

day_one = make_date_range(day1 = '2019-03-17')
```

To bring up an interactive console outside Jupyter (helpful for copying to output):

```%qtconsole```

***
***
# Plotting
***

### Plotting with Seaborn

```python
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

# plt.rcParams['figure.figsize']=(18, 10)
# sns.set()
# sns.set_context()
# sns.set_palette(sns.color_palette("deep"))

def PlotTemps(df, name):
    print(f'Plotting {name} temperature')
    # relplot returns a FacetGrid (not an Axes), which is why .set and .savefig work on it
    ax = sns.relplot(x='time', y='temp_c', hue='hub', kind='scatter', linewidth=0.05, height=8, aspect=3, data=df, s=30)
    ax.set(xlim=(df['time'].min(), df['time'].max()))
    plt.xlabel('Date', fontsize=24)
    plt.ylabel('Temperature C', fontsize=24)
    plt.title(name, fontsize=38)
    ax.savefig(f'/Users/maggie/Desktop/data_exploration_images/temp/{name}.png')

```

### Kaitlyn's matplotlib preamble

```python
import matplotlib as mpl
import matplotlib.pyplot as plt

mpl_update = {
    'font.size': 16,
    'xtick.labelsize': 14,
    'ytick.labelsize': 14,
    'figure.figsize': [12.0, 8.0],
    'axes.labelsize': 20,
    'axes.titlesize': 20,
    'lines.linewidth': 3,
}
mpl.rcParams.update(mpl_update)
```


### Currently using
```python
sns.set(style='white', rc={"axes.titlesize":24,"axes.labelsize":20, "legend.fontsize":18, "legend.markerscale":2.5, "xtick.labelsize":20, "ytick.labelsize" : 18})
```

***
***
# Git
***

### Creating a new repo


- `$ git init` the directory
- `$ git add .` all files
- commit
- create new repo in github 
- `$ git remote add origin <github location>`
- `$ git push -u origin master`


ref: https://kbroman.org/github_tutorial/pages/init.html

### Remove and stop tracking files in a repo (eg, .DS_Store)  

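- Create a .gitignore file and list the files to untrack in it
- push changes
- `$ git rm -r --cached .` (removes everything from the index; the files stay on disk)
- add and commit (files matched by .gitignore are no longer tracked)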

ref:  
http://www.codeblocq.com/2016/01/Untrack-files-already-added-to-git-repository-based-on-gitignore/  
https://www.pluralsight.com/guides/how-to-use-gitignore-file

### Files to add to .gitignore

```
*/.ipynb_checkpoints
*/.DS_Store
.DS_Store
.ipynb_checkpoints
.gitignore
__pycache__
.gitattributes
```


### Add to .gitattributes
```
*.ipynb linguist-language=Python
```


***
## Branching

### Make and switch to new branch

```
$ git checkout -b check_pi
```
ref: https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging

### Merging

After updating branch `Maggie-Edits`:


```
$ git checkout master
$ git merge Maggie-Edits
```


If you start a merge and then want to cancel it: `$ git merge --abort`

ref: https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch07.html

### Delete branch
(locally)

```
$ git branch --delete <branch> 
```

ref: https://gist.github.com/cmatskas/454e3369e6963a1c8c89