# Python Handbook
By Aisha Nur Syed 

## Table of Contents:
* [Why Python?](#WhyPython)
* [Notebook Basics](#NoteBasics)
* [Variables](#vars)
    * [Naming Variables](#naming-vars)
* [Libraries](#libs)
* [Data Types](#DT)
    * [Text Data](#TextData)
    * [Numeric Data](#NumData)
    * [True/False Data](#TFData)
    * [Array](#array)
        * [Ordered](#a-ordered)
        * [Mutable](#a-mutable)
* [Loops](#loops)
    * [If Statements](#if)
    * [For Loops](#for)
    * [If Statement in For Loop](#for-if)
* [Functions](#funcs)
* [Non-Spatial Data in Python](#data)
    * [Load Data](#load)
    * [View Data](#view)
    * [Select Data](#select)
    * [Save Data](#save)
* [Translating Python to English](#translate)

## Why Python? <a class="anchor" id="WhyPython"></a>

Python is useful for visualization because it is:
- Precise
- Open source
- Customizable

Many coders keep their past projects for future reference. Feel free to save this notebook to help you understand and/or write code in the future!

## Notebook Basics<a class="anchor" id="NoteBasics"></a>

This notebook will have code cells and markdown cells. This text is in a markdown cell, and is not coded in Python. You can tell this is a markdown cell because there is no **In [    ]:** next to it. You do not need to run or edit markdown cells, as I will only use them to explain concepts.

In [None]:
#This block is a code cell and is runnable! If you run this, there will be no output, and you will learn why soon...

In [None]:
#This block will have an output, try running it and see what an output looks like!
print("Python is fun!")

In [None]:
#Sometimes a cell will run but show an error message instead of the output you expected. Here, the closing quotation mark is missing.
print("this gives an error)

## Variables<a class="anchor" id="vars"></a>

Variables store values. Variable names never have quotation marks around them.

In [None]:
x = 2
y = "cat"
print(x)
print(y)

You can overwrite variables...

In [None]:
x = 3
x

Do math with variables...

In [None]:
z = 5
print(x*z)

and contain many values in one variable.

In [None]:
animals = [y, "dog", "owl"]

print(animals)

### Naming Variables<a class="anchor" id="naming-vars"></a>

Variable names:
- Are case sensitive
- Must start with a letter or _
- Can only contain alpha-numeric characters and underscores
- Cannot contain spaces
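
One way to test these rules without triggering an error is to ask Python whether a piece of text would be a legal name. The short sketch below uses the built-in string method `isidentifier()` and the standard `keyword` module. The names being tested are made up for illustration. It also checks for reserved words such as `for`, which are not listed above but cannot be used as variable names.

```python
import keyword

# isidentifier() returns True when the text follows the character rules above
print("var1".isidentifier())          # True
print("_total".isidentifier())        # True
print("1stvar".isidentifier())        # False: starts with a number
print("my-name".isidentifier())       # False: contains a hyphen
print("water volume".isidentifier())  # False: contains a space

# Reserved words pass the character rules but still cannot be variable names
print(keyword.iskeyword("for"))       # True
```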

Variable names should be descriptive and simple. If you want to use multiple words to create your variable name, you can use:
- Camel case (`totalVolume`)
- Underscores (`total_volume`)

In [None]:
#Good and legal variable names
var1 = 2
myName = "Aisha"
waterVol_mL = 7.5

In [None]:
#Bad but legal variable names
myfirstvariable = 2
guesswhatthismeans = ["a", "f", "c"]
mynameis = "Aisha"
amOUntoFwaterImeasuredonmyplantsleaf3 = 7.5

In [None]:
#Illegal variable names
1stvar = 2
my-name = "Aisha"
water volume = 7.5

## Libraries<a class="anchor" id="libs"></a>

Python also has libraries that provide variables, functions, and data types that we may want to use.

In [None]:
import numpy as np #numpy contains common math variables and allows us to use arrays
import pandas as pd #pandas is used to work with data
import matplotlib.pyplot as plt #used to plot graphs

In [None]:
pi = np.pi
print(pi)

Not all libraries are installed on JupyterHub. You can install libraries that are not part of the environment.

In [None]:
!pip install laspy

#You still have to import libraries after you install them
import laspy
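
If you are not sure whether a library is already available, you can check before installing it. The sketch below is one possible approach that uses `importlib.util.find_spec` from the standard library. The variable names `json_spec` and `laspy_spec` are made up for illustration, and the second result depends on your own environment.

```python
import importlib.util

# find_spec returns None when Python cannot find a library with that name
json_spec = importlib.util.find_spec("json")  # json ships with Python
print(json_spec is not None)  # True

# The answer for laspy depends on whether it is installed in your environment
laspy_spec = importlib.util.find_spec("laspy")
print(laspy_spec is not None)
```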

## Data Types<a class="anchor" id="DT"></a>

Variables are stored as different data types. Data types have different features.
- Text: `str`
- Numeric: `float` or `int`
- Boolean (True/False): `bool`
- Iterable collection of items: `list` (called an array in this handbook)
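
As a small illustration of how data types behave differently, the sketch below applies the same `+` operator to numbers, text, and collections. The values are arbitrary examples.

```python
# With numbers, + adds
print(2 + 3)      # 5

# With text, + joins the two pieces of text together
print("2" + "3")  # 23

# With collections, + joins them into one longer collection
print([2] + [3])  # [2, 3]

# Comparing two values produces Boolean data
print(2 < 3)      # True
```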

### Text Data<a class="anchor" id="TextData"></a>

In [None]:
print(y)
print(type(y))

### Numeric Data<a class="anchor" id="NumData"></a>

`float` data contains decimals, and `int` data is an integer.

In [None]:
print(x)
print(type(x))
print(type(3.0))
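
The type of a result depends on the values and the operation used. The sketch below, with arbitrary numbers, shows that mixing an `int` with a `float` gives a `float`, and that division with `/` gives a `float` even when the answer is a whole number.

```python
print(type(3 + 2))    # <class 'int'>
print(type(3 + 2.0))  # <class 'float'>

# Division with / returns a float, even for whole-number answers
print(6 / 2)          # 3.0
print(type(6 / 2))    # <class 'float'>
```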

### True/False Data<a class="anchor" id="TFData"></a>

`bool` data is either `True` or `False`. Boolean data is rarely defined by the user (though it can be), and is instead the outcome of checking if a condition is met.

In [None]:
print(type(True))

In [None]:
check1 = 5 > 4
print("5 > 4 is", check1)
print(type(check1))

In [None]:
check2 = 8 > 100
print("8 > 100 is", check2)
print(type(check2))

### Array<a class="anchor" id="array"></a>

An array is an iterable collection of data. It is one variable that can hold multiple values. Each value it holds is called an item or element. Arrays are:
- Ordered
- Mutable
- Able to hold non-unique items

#### Ordered<a class="anchor" id="a-ordered"></a>

Choosing an element out of an array is called "indexing" when you access it by its index number. An index number is the ordered position of the element in the array. The index starts from 0.

In [None]:
numArray = [2,5,7,2,3]

#get the first element of the array
numArray[0]

You can choose a negative number for the index as well. When you use a negative number, you count backwards. Thus, the last element of the array has an index of -1.

In [None]:
numArray[-1]
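
Index numbers can also select several neighboring elements at once, which is usually called slicing. The sketch below reuses `numArray` from above. A slice is written as `start:end`, and the element at the end position is not included.

```python
numArray = [2, 5, 7, 2, 3]

# Elements at index 1 and 2 (index 3 is not included)
print(numArray[1:3])  # [5, 7]

# Leaving out the start begins at the first element
print(numArray[:2])   # [2, 5]

# Negative numbers count backwards here too: the last two elements
print(numArray[-2:])  # [2, 3]
```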

#### Mutable<a class="anchor" id="a-mutable"></a>

Let's start off with an empty array.

In [None]:
emptyArray = []
print(emptyArray)

You can add one item to an array...

In [None]:
emptyArray.append("cat")
print(emptyArray)

You can add multiple items to an array...

In [None]:
emptyArray.extend(["cat", "dog", "owl"])
print(emptyArray)
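
Being mutable also means you can change or remove items that are already in an array. The sketch below uses a made-up array called `pets` to show both, along with a repeated item.

```python
pets = ["cat", "dog", "owl", "cat"]  # "cat" appears twice, which is allowed

# Replace the element at index 1
pets[1] = "fish"
print(pets)  # ['cat', 'fish', 'owl', 'cat']

# remove() deletes only the first matching item
pets.remove("cat")
print(pets)  # ['fish', 'owl', 'cat']
```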

## Loops<a class="anchor" id="loops"></a>

Loops check a condition and run, or iterate over an object, until the condition is no longer `True`. You can use different expressions to write conditions:
- `==` is equal to
- `!=` is not equal to
- `>` is greater than
- `>=` is greater than or equal to
- `<` is less than
- `<=` is less than or equal to 

You can also use logical operators:
- `and`
- `or`
- `not`
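
Each of these expressions produces a Boolean value when it runs. The sketch below tries them on two made-up variables, `a` and `b`.

```python
a = 5
b = 4

print(a == b)  # False
print(a != b)  # True
print(a > b)   # True
print(a <= b)  # False

# and is True only when both conditions are True
print(a > 2 and b > 10)  # False

# or is True when at least one condition is True
print(a > 2 or b > 10)   # True

# not flips True to False and False to True
print(not a > 2)         # False
```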

<img src = "AndOr.png" style = "height:280px" title = "And vs Or Diagram">

### If Statements<a class="anchor" id="if"></a>

Structure of an if statement: <br>
if condition:<br>
&nbsp; expression <br>

If statements check if the condition is true. If the condition is true, then the expression code runs. If the condition is false, then the expression code does not run.    

In [None]:
# Note that indentation is important. You can either use a space or a tab to indent. Be consistent!

if 5 > 4:
    print("5 is greater than 4.")
    print("The condition is true.")

In [None]:
# This cell gives an IndentationError because the second print line has one extra space
if 5 > 4:
    print("5 is greater than 4.")
     print("The condition is true.")

In [None]:
# The condition is false, so nothing is printed
if 5 < 4:
    print("5 is less than 4.")
    print("The condition is true.")

### For Loops <a class="anchor" id="for"></a>
Structure of a for loop: <br>
for element in array:<br>
&nbsp; expression <br>

For loops iterate over elements in an array to perform the same operation on every item in the given array. Similar to if statements, indentation matters.

In [None]:
exArray  = [2, 5, 3, 2, 4]

for i in range(len(exArray)):
    element = exArray[i]
    print(element)

In [None]:
range(len(exArray))

In [None]:
for ele in exArray:
    print(ele)
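
A for loop can also build a new array while it runs. The sketch below doubles every element of `exArray` and stores the results in a made-up array called `doubled`.

```python
exArray = [2, 5, 3, 2, 4]
doubled = []

for ele in exArray:
    # The same operation runs once for every element
    doubled.append(ele * 2)

print(doubled)  # [4, 10, 6, 4, 8]
```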

### If Statement in For Loop <a class="anchor" id="for-if"></a>
You can combine for loops and if statements to run an expression on elements in an array if the condition is true.

In [None]:
for i in range(len(exArray)):
    element = exArray[i]
    if element != 2:
        print(element)

#### And and Or

In [None]:
zeroToTen = range(0,11)

for i in zeroToTen:
    if i > 2 and i < 8:
        print(i)

In [None]:
for i in zeroToTen:
    # Every number from 0 to 10 meets at least one of these conditions, so all of them print
    if i > 2 or i < 8:
        print(i)

In [None]:
for i in zeroToTen:
    if not(i > 2 and i < 8):
        print(i)
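
To compare the three conditions side by side, the sketch below collects the numbers that pass each one. The array names `both`, `either`, and `notBoth` are made up for illustration.

```python
zeroToTen = range(0, 11)

both = []     # numbers where i > 2 and i < 8
either = []   # numbers where i > 2 or i < 8
notBoth = []  # numbers where not(i > 2 and i < 8)

for i in zeroToTen:
    if i > 2 and i < 8:
        both.append(i)
    if i > 2 or i < 8:
        either.append(i)
    if not(i > 2 and i < 8):
        notBoth.append(i)

print(both)     # [3, 4, 5, 6, 7]
print(either)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(notBoth)  # [0, 1, 2, 8, 9, 10]
```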

## Functions<a class="anchor" id="funcs"></a>

Functions allow you to run a block of code. You can create your own function...

In [None]:
def function_name(argument1, argument2):
    #add the two arguments together and send the result back
    total = argument1 + argument2
    return total

Running a function is known as calling the function...

In [None]:
function_name(3,7)

You can call existing functions...

In [None]:
np.sum(exArray)
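
Functions can contain the loops and if statements covered earlier, and the value they return can be stored in a variable. The sketch below defines a hypothetical function, `count_greater_than`, that is not part of any library.

```python
def count_greater_than(values, threshold):
    # Count how many elements in values are greater than threshold
    count = 0
    for value in values:
        if value > threshold:
            count = count + 1
    return count

exArray = [2, 5, 3, 2, 4]

# Store the returned value in a variable so it can be reused
result = count_greater_than(exArray, 2)
print(result)  # 3
```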

## Non-Spatial Data in Python<a class="anchor" id="data"></a>

### Load Data<a class="anchor" id="load"></a>

Now that we have an understanding of Python syntax, we can explore data with the popular Python library `pandas`. We will look at data from the Institute of Museum and Library Services that contains the name, type, location, and revenue for every museum in the United States. For more information about the dataset, you can visit [this website](https://www.kaggle.com/datasets/imls/museum-directory/).

In [None]:
# Load data
museums = pd.read_csv("museums.csv", low_memory = False)

### View Data<a class="anchor" id="view"></a>

In [None]:
museums.head()

In [None]:
museums.tail(3)

### Select Data<a class="anchor" id="select"></a>

In [None]:
museums.columns

In [None]:
ColsInterest = ["Museum Name", "Museum Type", "City (Administrative Location)", "Revenue"]
museums[ColsInterest].head()

In [None]:
museums.loc[0:5, ColsInterest]
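
`.loc` selects by label and `.iloc` selects by position, and the two treat the end of a range differently. The sketch below shows this on a small made-up dataframe called `toy`. The museum names and revenue values are invented for illustration and do not come from the museums dataset.

```python
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({
    "Museum Name": ["A", "B", "C", "D"],
    "Museum Type": ["ART", "HISTORY", "SCIENCE", "ART"],
    "Revenue": [100.0, 250.0, 75.0, 300.0],
})

# .loc uses labels and includes the end label: rows 0, 1, and 2
by_label = toy.loc[0:2, ["Museum Name", "Revenue"]]
print(by_label.shape)     # (3, 2)

# .iloc uses positions and excludes the end position: rows 0 and 1
# -2: selects the last two columns
by_position = toy.iloc[0:2, -2:]
print(by_position.shape)  # (2, 2)
```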

### Save Data<a class="anchor" id="save"></a>

In [None]:
#select all rows, and the columns from the fourth-to-last up to (but not including) the last
df = museums.iloc[:, -4:-1]
df

In [None]:
df.to_csv("MuseumTaxInfo.csv")
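
By default, `to_csv` also writes the row index as an extra first column. The sketch below shows one way to leave it out with `index = False`. It uses a small made-up dataframe and writes to an in-memory text buffer instead of a file so that nothing is saved to disk. The names `toy`, `buffer`, and `reloaded` are made up for illustration.

```python
import io
import pandas as pd

# Made-up data for illustration only
toy = pd.DataFrame({"name": ["A", "B"], "revenue": [100, 250]})

# Write the CSV text into memory, without the row index
buffer = io.StringIO()
toy.to_csv(buffer, index = False)
print(buffer.getvalue())

# Read the same text back in to confirm the columns match
reloaded = pd.read_csv(io.StringIO(buffer.getvalue()))
print(list(reloaded.columns))  # ['name', 'revenue']
```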

## Translating Python to English<a class="anchor" id="translate"></a>

In [None]:
x = 3 #set var x to equal 3
y = 4 #set var y to equal 4
print(x, "+", y, "=", x + y) #output: 3 + 4 = 7

In [None]:
digits = list(range(10)) #set digits var to the array of numbers 0-9
GreaterThan5 = [] #create an empty array called GreaterThan5

for i in range(len(digits)): #iterate over index positions 0-9
    num = digits[i] #num is an element of array digits (numbers 0-9)
    if num > 5: #check if num is greater than 5
        GreaterThan5.append(num) #add the number to GreaterThan5 array
        
GreaterThan5 #array of numbers greater than 5

In [None]:
museums["Income"].describe() #select the Income column from the museums dataframe and generate descriptive statistics

In [None]:
plt.hist(museums["Income"], bins = 100) #plot a histogram of income with 100 bins
plt.show() #show the figure