# First, you need to download a dataset to use

https://www.cryptodatadownload.com/cdd/Gdax_BTCGBP_1h.csv

### This dataset must be saved in the same location as the notebook
You also need to open the file and delete its first row before loading it, otherwise pandas will not read the file cleanly.
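As an alternative to editing the file by hand, pandas can skip a row while reading. The sketch below is only an illustration: it assumes the unwanted content is exactly one line, and it uses a small made-up sample (the `sample_csv` text and its column names are hypothetical) in place of the real download.

```python
import io

import pandas as pd

# Made-up stand-in for the downloaded file: one unwanted first line,
# followed by the real header row and two data rows.
sample_csv = io.StringIO(
    "this line is not part of the table\n"
    "Date,Symbol,Close\n"
    "2019-01-01 01-AM,BTCGBP,2950.0\n"
    "2019-01-01 12-AM,BTCGBP,2940.5\n"
)

# skiprows=1 tells read_csv to ignore the first line of the input.
sample_df = pd.read_csv(sample_csv, skiprows=1)
print(sample_df.columns.tolist())
```

If the real file has the same shape, passing `skiprows=1` to the `pd.read_csv` call on the unedited download should have the same effect as deleting the row manually.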

### Then you need to do your imports (modules and datasets)

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("Gdax_BTCGBP_1h.csv")

The pd.read_csv function returns a DataFrame object built from the contents of the .csv file.

Dataframes work much like 2-dimensional arrays with labelled rows and columns, so values can be selected by row and by column.
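To make the row-and-column idea concrete, here is a minimal sketch on a hypothetical three-row frame called `toy` (not the real dataset). It shows one way to select a column, a row, and a single cell.

```python
import pandas as pd

# Hypothetical miniature frame, used only for illustration.
toy = pd.DataFrame({"Open": [10.0, 11.0, 12.0], "Close": [11.0, 10.5, 12.5]})

close_column = toy["Close"]          # one column, returned as a Series
second_row = toy.iloc[1]             # one row, selected by position
single_value = toy.loc[1, "Close"]   # one cell, by row label and column name

print(single_value)
```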

### By typing "df" and running the cell, you can view the dataframe as Python sees it.

In [None]:
df

### The df.dtypes attribute returns a Series listing every column in the dataframe and its datatype

In [None]:
df.dtypes

### This is good, but the dataframe could be improved upon...
We can tidy the dataset by renaming a few of the columns to more Python-friendly names.

The df.rename method returns a copy of the dataframe with the chosen labels renamed.

Here, the structure:

columns={"Volume From": "VolFrom", "Volume To": "VolTo"}

shows that we are taking the column called "Volume From" and changing its name to "VolFrom", and likewise "Volume To" becomes "VolTo". Names without spaces are easier to type, and they can also be used with attribute access such as df.VolFrom.

In [None]:
df = df.rename(columns={"Volume From": "VolFrom", "Volume To": "VolTo"})
df
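The result of the rename is assigned back to df because, by default, rename returns a new dataframe and leaves the original untouched. A small sketch on a hypothetical two-row frame (`toy` and its values are made up) shows this:

```python
import pandas as pd

# Hypothetical frame with the same column names as the ones being renamed.
toy = pd.DataFrame({"Volume From": [1.5, 2.0], "Volume To": [4000.0, 5200.0]})

renamed = toy.rename(columns={"Volume From": "VolFrom", "Volume To": "VolTo"})

print(toy.columns.tolist())      # the original frame keeps its old names
print(renamed.columns.tolist())  # the returned copy carries the new names
```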

# Let's visualise some of this data...

Using the matplotlib library we imported earlier, we can plot the closing price of bitcoin for every row in the dataset. The x-axis is the row number of the dataframe, not a date.

In [None]:
mygraph = plt.plot(df.Close)

### Now let's do some processing on the data itself

* **lambda** functions are anonymous functions written as a single expression.
* They behave like normal functions, and are useful when you only want to use a function **once** and don't need to give it a name.
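As a quick illustration of the points above, here is a named function next to an equivalent lambda. The names `is_higher` and `is_higher_lambda` are hypothetical and exist only for this comparison.

```python
# A normal, named function.
def is_higher(a, b):
    return a > b


# The same logic as a lambda. It is bound to a name here only so the two
# can be compared; usually a lambda is passed straight to another function.
is_higher_lambda = lambda a, b: a > b

print(is_higher(3, 2), is_higher_lambda(3, 2))
```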

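The code below will create a new column called "**buy**" which records, for each row, whether it would have been a good idea to buy bitcoin at that point.

* The lambda function takes the dataframe as its argument *x*, and returns the result of the comparison: (the next row's closing price > this row's closing price)
* x.Close.shift(-1) moves the Close column up by one row, so each row is lined up with the closing price of the row after it.
* This boolean output is then cast to an integer (1 for True, 0 for False) so it can be presented better within the dataframe.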

In [None]:
df = df.assign(buy=lambda x: (x.Close.shift(-1) > x.Close).astype(int))
df
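To see what the shift and comparison do step by step, here is a minimal sketch on a hypothetical four-row frame. The `toy` data and the extra `next_close` column are made up for illustration and are not part of the notebook's dataset.

```python
import pandas as pd

# Hypothetical closing prices, one per row.
toy = pd.DataFrame({"Close": [100.0, 105.0, 103.0, 110.0]})

# shift(-1) moves every value up one row, so each row sees the next row's
# Close. The last row has no next row and gets NaN.
next_close = toy.Close.shift(-1)

toy = toy.assign(
    next_close=next_close,
    buy=(next_close > toy.Close).astype(int),
)
print(toy)
```

Note that the last row ends up with buy equal to 0: comparing NaN with a number gives False, so a row without a following row is never flagged.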