# Tutorial

## General
### Loading file (.txt)
```python
import numpy as np

data = np.loadtxt('ex1data1.txt', delimiter=',')
data.shape  # (number of rows, number of columns), returned as a tuple
m = data.shape[0]  # number of rows
# column_stack: joins 1-D arrays as columns, giving an (m, 2) array: a column of ones, then the first data column.
# (np.vstack(zip(...)) fails on Python 3, where zip returns an iterator rather than a list.)
X = np.column_stack((np.ones(m), data[:, 0]))
y = data[:, 1]  # everything in the second column
```

### Loading file (.csv)
Pandas way:
```python
import pandas as pd
titanic_df = pd.read_csv("../input/train.csv")
```

You can also load a `.csv` file with NumPy. However, loading it with pandas gives you a DataFrame, which offers labeled columns, mixed data types, and a rich set of data-manipulation methods. See the [`genfromtxt` documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html) for how to read a CSV file in NumPy. Note that we used NumPy's `loadtxt` when loading the `.txt` file above. `loadtxt` behaves like `genfromtxt` when no data is missing, but `genfromtxt` can also handle missing values. For that reason, I recommend using `genfromtxt` over `loadtxt`; see [this Stack Overflow question](http://stackoverflow.com/questions/20245593/difference-between-numpy-genfromtxt-and-numpy-loadtxt-and-unpack) to understand why.
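To make the difference concrete, here is a minimal sketch that parses a small in-memory CSV with one missing field using both functions. The sample values and variable names are made up for illustration; with the default settings, `genfromtxt` should fill the gap with `nan`, while `loadtxt` raises an error on the empty field.

```python
import io

import numpy as np

# Hypothetical sample data: the second row is missing its middle field.
raw = "1.0,2.0,3.0\n4.0,,6.0\n"

# genfromtxt treats the empty field as missing and fills it with nan.
filled = np.genfromtxt(io.StringIO(raw), delimiter=",")
print(filled)

# loadtxt cannot convert the empty field to a float and raises ValueError.
try:
    np.loadtxt(io.StringIO(raw), delimiter=",")
    loadtxt_failed = False
except ValueError:
    loadtxt_failed = True
print(loadtxt_failed)
```

If your files never contain gaps, either function works; `genfromtxt` simply degrades more gracefully when they do.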

## Pandas
### Exploring/Manipulating DataFrame
[Pandas in 10 minutes](http://pandas.pydata.org/pandas-docs/stable/10min.html)

[DataFrame](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe)

```python
# Getting info
df.info()
df.head()
df.tail()
df.mean(0) # argument is the axis (0: one mean per column). More descriptive statistics: http://pandas.pydata.org/pandas-docs/stable/basics.html#descriptive-statistics
df.describe()

# Selecting/Indexing
df['one']
df.iloc[loc] # Select row by integer location. Output: Series
df['one'].value_counts()

# Modifying
df['three'] = df['one'] * df['two']
iris.assign(sepal_ratio = iris['SepalWidth'] / iris['SepalLength']) # same idea as above, but returns a new DataFrame and leaves the original unchanged (non-destructive)
df['flag'] = df['one'] > 2

# Dropping
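df = df.drop(['PassengerId','Name','Ticket'], axis=1) # axis=1: drop columns; axis=0 (default): drop rows by index label
del df['two'] # delete a column in place
three = df.pop('three') # remove a column and return it as a Series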

```
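The snippets above assume an existing `df`. As a rough, self-contained sketch, the following builds a small toy DataFrame (the values are invented for illustration; the column names mirror the ones used above) and runs the modifying and dropping operations on it, so you can see which calls change `df` in place and which return a new object.

```python
import pandas as pd

# Hypothetical toy frame; column names mirror the snippets above.
df = pd.DataFrame({"one": [1, 2, 3], "two": [10, 20, 30]})

df["three"] = df["one"] * df["two"]        # adds a column in place
with_flag = df.assign(flag=df["one"] > 2)  # returns a new frame; df is unchanged

no_two = df.drop(["two"], axis=1)          # axis=1 drops columns
no_first_row = df.drop([0], axis=0)        # axis=0 drops rows by index label
three = df.pop("three")                    # removes the column and returns it

print(list(df.columns))
print(list(with_flag.columns))
print(list(no_first_row.index))
```

Note that `drop` and `assign` return new DataFrames, whereas column assignment, `del`, and `pop` modify `df` directly.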

[Working with missing data](http://pandas.pydata.org/pandas-docs/stable/missing_data.html)
```python
# Filling missing values
titanic_df["Embarked"] = titanic_df["Embarked"].fillna("S")
```
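Before filling, it usually helps to check how many values are missing and what a sensible replacement would be. The sketch below uses a made-up stand-in for the `Embarked` column and fills the gaps with the most frequent value; choosing the mode is an assumption for illustration, not necessarily how `"S"` was picked above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Titanic "Embarked" column.
df = pd.DataFrame({"Embarked": ["S", np.nan, "C", "S", None]})

n_missing = df["Embarked"].isna().sum()   # count missing entries
most_common = df["Embarked"].mode()[0]    # most frequent non-missing value
df["Embarked"] = df["Embarked"].fillna(most_common)

print(n_missing, most_common)
print(df["Embarked"].tolist())
```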

## Numpy

[Numpy Array](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)

## Matplotlib

[Basic](http://matplotlib.org/users/pyplot_tutorial.html)

[Tutorial](http://www.labri.fr/perso/nrougier/teaching/matplotlib/)

```python
import matplotlib.pyplot as plt

plt.plot(x, y) # x and y should be vectors
plt.show()
```
