# Introduction to Shell for Data Science

In [1]:
import sys
import datetime as dt

In [2]:
# Notebook Info
nb_info = {'Author':'Simon Zahn', 'Last Updated':dt.datetime.now().strftime('%Y-%m-%d %H:%M'), 'Python Version':sys.version }

for k,v in nb_info.items():
    print((k + ':').ljust(18), str(v))

Author:            Simon Zahn
Last Updated:      2019-07-28 17:46
Python Version:    3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)]


--------------------------

### Table of Contents

1. [Imports and Top Matter](#Imports-and-Top-Matter)
1. [Introduction to Shell for Data Science](#Introduction-to-Shell-for-Data-Science)

--------------------------

### Imports and Top Matter
[[back to top]](#Table-of-Contents)

In [6]:
# standard library
# general

# IPython
from IPython.display import display, Image

# analysis
import numpy as np
import pandas as pd
from scipy import stats, special

# plotting
import matplotlib.pyplot as plt
import seaborn as sns

In [5]:
pal = sns.color_palette()

The filesystem manages files and directories (or folders). Each is identified by an absolute path that shows how to reach it from the filesystem's root directory: `/home/repl` is the directory `repl` in the directory `home`, while `/home/repl/course.txt` is the file `course.txt` in that directory, and `/` on its own is the root directory.

To find out where you are in the filesystem, run the command `pwd` (short for "print working directory"). This prints the absolute path of your current working directory, which is where the shell runs commands and looks for files by default.

In [9]:
pwd

'C:\\Users\\Simon\\Documents\\Python Scripts\\Python_Notes'
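
Because this notebook runs on a Python kernel, the same information can also be read from Python itself. The cell below is a minimal sketch, not part of the original course material; the variable names `cwd_from_os` and `cwd_from_pathlib` are illustrative only.

```python
import os
from pathlib import Path

# Both calls report the working directory of the running process,
# which is the same location that `pwd` prints.
cwd_from_os = os.getcwd()        # returned as a plain string
cwd_from_pathlib = Path.cwd()    # returned as a Path object

print(cwd_from_os)
# The working directory is always reported as an absolute path.
print(cwd_from_pathlib.is_absolute())
```

The string form is convenient for printing, while the `Path` object is easier to combine with other path components later on.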

`pwd` tells you where you are. To find out what's there, type `ls` (which is short for "listing") and press the Enter key. On its own, `ls` lists the contents of your current directory (the one displayed by `pwd`). If you add the names of some files, `ls` will list them, and if you add the names of directories, it will list their contents. For example, `ls /home/repl` shows you what's in your starting directory (usually called your home directory).

To list the files in the directory `/home/repl/seasonal` (which holds information on dental surgeries by date, broken down by season), pass that path to `ls` as an argument.

In [10]:
ls

 Volume in drive C is Local Disk
 Volume Serial Number is 8085-52B5

 Directory of C:\Users\Simon\Documents\Python Scripts\Python_Notes

2019-07-28  17:52    <DIR>          .
2019-07-28  17:52    <DIR>          ..
2019-07-28  17:46    <DIR>          .ipynb_checkpoints
2019-07-28  09:44            32,090 1-1 Basic Python.ipynb
2019-07-28  17:48            32,257 1-5 Importing Data in Python.ipynb
2019-07-28  17:49           661,478 2-1 Statistical Thinking in Python.ipynb
2019-07-28  17:52             4,309 3-1 Intro to Shell for Data Science.ipynb
2019-07-28  09:44           189,229 Basic Graphing.ipynb
2019-07-28  09:44    <DIR>          Book and Course Notes
2019-07-28  15:58    <DIR>          Data
2019-07-28  09:44    <DIR>          Images
2019-07-28  09:44    <DIR>          Modules - Data and Analysis
2019-07-28  09:44               172 README.md
2019-07-28  09:44            61,024 Udacity Intro to Statistics Notes.ipynb
2019-07-28  09:44    <DIR>          Useful_Code_Snippets
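
The behaviour described for `ls` (no argument lists the current directory, a directory argument lists its contents, a file argument lists the file itself) can be approximated in Python. This is a rough sketch under that description only: `list_directory` is a hypothetical helper written for these notes, and it makes no attempt to reproduce the real command's options or output format.

```python
from pathlib import Path


def list_directory(path="."):
    """Return the sorted entry names in `path`; directories get a trailing '/'."""
    target = Path(path)
    if target.is_file():
        # A file argument is listed by itself, as the text describes for `ls`.
        return [str(target)]
    return sorted(
        entry.name + "/" if entry.is_dir() else entry.name
        for entry in target.iterdir()
    )


# With no argument, the current working directory is listed.
for name in list_directory():
    print(name)
```

Calling `list_directory("Data")` from the directory shown above would list the contents of the `Data` folder instead, using a path relative to the current working directory.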
    

An absolute path is like a latitude and longitude: it has the same value no matter where you are. A relative path, on the other hand, specifies a location starting from where you are: it's like saying "20 kilometers north".

For example, if you are in the directory `/home/repl`, the relative path `seasonal` specifies the same directory as `/home/repl/seasonal`, while `seasonal/winter.csv` specifies the same file as `/home/repl/seasonal/winter.csv`. The shell decides if a path is absolute or relative by looking at its first character: if it begins with `/`, it is absolute, and if it doesn't, it is relative.
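
The first-character rule can be checked with Python's `pathlib`. The sketch below uses `PurePosixPath`, which applies Unix path rules without touching the real filesystem, so it behaves the same on Windows. It assumes the working directory is `/home/repl`, as in the example; `resolve` is a hypothetical helper, not a shell or library function.

```python
from pathlib import PurePosixPath

# Assumed current directory, taken from the example in the text.
WORKING_DIR = PurePosixPath("/home/repl")


def resolve(path, working_dir=WORKING_DIR):
    """Return the absolute location that `path` refers to from `working_dir`."""
    # Joining leaves an absolute path unchanged and anchors a
    # relative path at the working directory.
    return working_dir / PurePosixPath(path)


for text in ["/home/repl/seasonal", "seasonal", "seasonal/winter.csv"]:
    kind = "absolute" if PurePosixPath(text).is_absolute() else "relative"
    print(f"{text:22} {kind:9} -> {resolve(text)}")
```

Both `seasonal` and `/home/repl/seasonal` resolve to the same location here, which is the point of the latitude-and-longitude analogy: the relative form only makes sense once the starting directory is known.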