# Jupyter Training Notebook 1 - Executing Code, Examining Output, Adding Markdown and DataFrames

This notebook has been created by the CSI-EU team to train you in using the Jupyter Notebook.

You should already have downloaded [Anaconda](https://www.anaconda.com/download) for your operating system.

#### Launch Anaconda Navigator - Jupyter Lab

By reading this page you are already interacting with a Jupyter notebook.

#### Example 1 - Executing Code

Below this text is a code cell. Click on it and press CTRL + ENTER to execute it:

In [None]:
import platform
platform.platform()

Normally, if you ran the above as a Python script, nothing would appear in your standard output, because the script never prints the result. Displaying and examining values like this is one of Jupyter's primary uses, and it works across multiple platforms and languages.

The value of whichever expression appears on the last line of a cell is displayed as the cell's output.
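
As a small illustration of that display rule, the sketch below contrasts `print()` with a bare expression on the last line. The variable names `system_name` and `python_version` are made up for this example, and the behaviour described assumes Jupyter's default settings.

```python
import platform

system_name = platform.system()              # e.g. 'Linux', 'Windows' or 'Darwin'
python_version = platform.python_version()   # e.g. '3.11.4'

# print() writes to the output area wherever it appears in the cell.
print(system_name)

# A bare expression is displayed only when it is the last line of the cell.
python_version
```

If you move the final line above the `print()` call, the cell still runs, but the version is no longer displayed automatically.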

Click <a href='#Kernels'>here</a> to continue.

<a id="Kernels"></a>
# Kernels

Code in Jupyter runs inside what is known as a kernel. This notebook, for example, uses the Python 3 kernel, which means that it executes Python 3 code.

Jupyter was originally created to allow interaction with Python, but it can now run a wide variety of kernels, such as Bash, PowerShell, Node.js and [many more](https://github.com/jupyter/jupyter/wiki/Jupyter-kernels).

If you are ever curious which kernel you are using, look at the top right-hand corner of your notebook: you will see the kernel name with a circle next to it.
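
You can also ask a Python kernel about itself from inside a cell. The following is a minimal sketch that only applies to Python kernels; `kernel_summary` is a name chosen for this example.

```python
import sys

# sys.version_info describes the interpreter that the kernel is running.
version = sys.version_info

# sys.executable is the path of the Python program behind the kernel.
kernel_summary = f"Python {version.major}.{version.minor} running from {sys.executable}"
print(kernel_summary)
```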

## Exercise 1 - Code Completion

In the code box below add the character "." to the end of platform and hit the TAB key:

![alt text](plat shot.png "Title")

Use the code completion to locate the following members of platform:

- os.name
- run the function platform
- run the function os.getcwd

Remember to execute with CTRL + ENTER.
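
If the completion menu does not appear, the built-in `dir()` function gives a similar list of names programmatically. This is only a rough sketch of an alternative, not how the TAB key works internally; `public_names` is a name chosen for this example.

```python
import platform

# dir() returns the attribute names of an object as a list of strings.
# Names starting with an underscore are filtered out here, as they are
# usually internal helpers.
public_names = [name for name in dir(platform) if not name.startswith("_")]

print(len(public_names), "public names found")
print(public_names[:5])
```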

In [None]:
import platform

## Markdown Inside of Your Notebook

As well as being able to store and execute code, Jupyter Notebooks are able to contain rich markdown content to properly document and highlight how to use the code they contain.

For example, all of the screenshots and notes you have read so far are written in Markdown, even this note you are reading right now.

To view the markdown in a markdown cell you can double click on it.

Once you have finished examining it press CTRL+ENTER to render it once again.

---

## Exercise 2 - Using Markdown

With this cell highlighted click the plus symbol to add a new cell:

![alt text](plusmenu.png "Title")

Click inside your cell and then, using the drop-down menu, change its type from Code to Markdown:

![alt text](dropdownmd.png "Title")

Referring to other Markdown cells in this worksheet, create text with the following:

- An example of H1, H2 and H3 title text
- An example of bold and italic text
- Insert the image wave.jpg
- Create a list of bullet point items just like the one you've been working through

Click <a href='#Dataframes'>here</a> to continue.

# Dataframes

Technically, DataFrames are part of the Python library [pandas](https://pandas.pydata.org/pandas-docs/stable/index.html), but Jupyter has great options for rendering DataFrames in an HTML table view.

An example is shown below. Press CTRL + ENTER to execute it:

In [None]:
import requests
import pandas as pd

# Limit how much of the frame is rendered: at most 6 rows and 10 columns
pd.set_option('display.max_rows', 6)
pd.set_option('display.max_columns', 10)

# The API is paginated, so fetch each page and join the 'results' lists
results = requests.get("https://swapi.co/api/starships").json()['results']
results += requests.get("https://swapi.co/api/starships/?page=2").json()['results']
results += requests.get("https://swapi.co/api/starships/?page=3").json()['results']
results += requests.get("https://swapi.co/api/starships/?page=4").json()['results']

# Build a DataFrame from the list of starship dictionaries and display it
starwarsframe = pd.DataFrame(results)
starwarsframe

The only trouble with that data is that it's a bit of a mess. The `...` entries indicate that the rows and columns have been truncated to fit on screen.
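
To see how the display limits cause that truncation, here is a minimal sketch using a small made-up frame (`demo_frame` and its columns are hypothetical and are not part of the Star Wars data). It uses `pd.option_context` so the notebook's global display settings are left alone.

```python
import pandas as pd

# A hypothetical 20-row frame, used only to demonstrate truncation.
demo_frame = pd.DataFrame({"ship_id": range(20), "crew": range(100, 120)})

# option_context applies a display option only inside the `with` block.
with pd.option_context("display.max_rows", 6):
    truncated_text = repr(demo_frame)   # middle rows are replaced by dots

with pd.option_context("display.max_rows", None):
    full_text = repr(demo_frame)        # None means "no row limit"

print(truncated_text)
```

The truncated view ends with a line giving the full dimensions of the frame, so you can always tell how much has been hidden.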

Data courtesy of the [Star Wars API](https://swapi.co/)

---


## DataFrames - Selecting Fields

First of all let's have a look at what column names are in this frame:

In [None]:
list(starwarsframe)

An easy way to select particular columns from a DataFrame is to put the column names into a list and use that list to index the frame:

In [None]:
col_list = ['name','crew','MGLT','starship_class']
starwarsframe[col_list]
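
It can help to know what type comes back from each kind of selection. The sketch below uses a small invented frame (`ships` and its values are hypothetical) to contrast indexing with a single column name against indexing with a list of names.

```python
import pandas as pd

# A small hypothetical frame, standing in for starwarsframe.
ships = pd.DataFrame({
    "name": ["Falcon", "Ghost", "Raven"],
    "crew": ["4", "6", "2"],
    "length": ["34", "43", "29"],
})

single_column = ships["name"]          # a string key returns a Series
two_columns = ships[["name", "crew"]]  # a list of keys returns a DataFrame

print(type(single_column).__name__)  # Series
print(type(two_columns).__name__)    # DataFrame
```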

---
### Exercise 3 - Selecting Fields

- Using the list of Column Names as a reference, create a query showing the name, cost, model, cargo and films columns
- Using the list of Column Names as a reference, create a query showing the name, MGLT, max_atmosphering_speed and consumables

In [None]:
col_list = []
starwarsframe[col_list]

---
### Selecting Fields Continued

- Set the value of col_list to be the names of all DataFrame columns.
- Use only the first 10 values in this list to show the first 10 columns
- Use only the last 6 values in this list to show the last 6 columns


In [None]:
col_list = list(starwarsframe)
starwarsframe
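
As a reminder of how Python slices work on an ordinary list, here is a short sketch with a made-up list (`column_names` is hypothetical and much shorter than the real column list).

```python
# A hypothetical list of column names, standing in for list(starwarsframe).
column_names = ["alpha", "beta", "gamma", "delta", "epsilon"]

first_two = column_names[:2]    # everything before index 2
last_three = column_names[-3:]  # the final three items

print(first_two)   # ['alpha', 'beta']
print(last_three)  # ['gamma', 'delta', 'epsilon']
```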

- Using the same logic display only the first 10 entries in the starwarsframe:

### Exercise 4 - Manipulating Values

It is also possible with pandas to create a Series based on values within a DataFrame.

- Create a Series which shows the count of manufacturers for our starships:

```python
starwarsframe['manufacturer'].value_counts()
##starwarsframe['manufacturer'].value_counts().p
```
- Execute the first line to view the Series, then experiment with selecting values from it and converting it into a list.
- Use the second line and see if you can figure out how to plot the Series as a pie chart.
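
To get a feel for what `value_counts()` returns before trying it on the starship data, here is a minimal sketch on invented data (`parts` and the maker names are hypothetical). It shows one way to select a value by label and to turn the counts and labels into lists.

```python
import pandas as pd

# Hypothetical data, used only to show the shape of the result.
parts = pd.DataFrame({
    "maker": ["Acme", "Globex", "Acme", "Initech", "Acme", "Globex"],
})

maker_counts = parts["maker"].value_counts()  # Series: label -> occurrences

print(maker_counts["Acme"])          # 3
print(maker_counts.tolist())         # [3, 2, 1]
print(maker_counts.index.tolist())   # ['Acme', 'Globex', 'Initech']
```

The labels become the index of the Series and the counts become its values, sorted from most to least frequent.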

---
### Exercise 5 - Plotting ~~To Overthrow the Empire~~

Pandas can also plot DataFrames automatically. In the example below we first have to convert the DataFrame column we want to plot into "real" numeric values.

We use errors='coerce' so that any value that can't be converted becomes NaN, and then we replace each NaN with 0 using fillna:

In [None]:
%matplotlib inline
from matplotlib import pyplot as plt

starwarsframe['hyperdrive_rating'] = pd.to_numeric(starwarsframe['hyperdrive_rating'], errors='coerce').fillna(value=0)
starwarsframe[['starship_class','hyperdrive_rating']].plot.barh(x='starship_class')
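
To see the two conversion steps in isolation, the sketch below runs them on a short made-up Series (`raw_ratings` and its values are hypothetical, chosen to include entries that cannot be parsed as numbers).

```python
import pandas as pd

# Hypothetical raw values: numbers stored as text, plus two unusable entries.
raw_ratings = pd.Series(["1.0", "2.5", "unknown", "n/a"])

# errors='coerce' turns anything that cannot be parsed into NaN ...
coerced = pd.to_numeric(raw_ratings, errors="coerce")
# ... and fillna replaces each NaN with the chosen default.
cleaned = coerced.fillna(value=0)

print(coerced.isna().tolist())  # [False, False, True, True]
print(cleaned.tolist())         # [1.0, 2.5, 0.0, 0.0]
```

Filling with 0 keeps every row in the plot, but an unknown rating then looks the same as a rating of zero, so bear that in mind when reading the chart.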


### Fancier Plots:

You can use the seaborn package to make your plots look even prettier:

In [None]:
import seaborn as sns
f, ax = plt.subplots(figsize=(12, 6))
starwarsframe['crew'] = pd.to_numeric(starwarsframe['crew'], errors='coerce').fillna(value=0)
starplot = sns.barplot(y = 'starship_class', x = 'hyperdrive_rating', data = starwarsframe)
for x_ticks in starplot.get_xticklabels():
    x_ticks.set_rotation(90)
##ax.legend(loc="lower right", frameon=True)
ax.set(ylabel="", xlabel="Hyperdrive Rating")


---

### Exercise 6 - Investigation

Use the Power of the Force - and perhaps Google - to figure out how to do the following:

- Select the first 5 DataFrame rows without using Python slice syntax (DataFrames have their own built-in method for this)
- Select the last 5 DataFrame rows
- Find the total cost (or sum) in credits of all of the ships in the frame
- Plot the hyperdrive rating of each ship as a vertical bar chart, with the ship's name as the X axis
- Do the above and experiment with some of the additional Seaborn formatting that is available
- Create a new DataFrame which shows the number of ships in each starship_class
- Plot the above as a bar chart