<img src="https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png" style="float: left; margin: 10px;"> 

# Python Intro 1: Data Types

Authors: _Tim Book & Jeff Hale_

---

### Learning objectives
*After this lesson, you will be able to:*
- Use JupyterLab features with Jupyter notebooks
- Create numbers, strings, booleans, lists, dictionaries and other data structures
- Demonstrate arithmetic, boolean, and string operations
- Use list and dictionary methods

### Prerequisites
*Before this lesson, you should already be able to:*

- Navigate the command line
- Work in an active conda environment with Python installed


## Let's chat about JupyterLab and Jupyter notebooks

- JupyterLab vs Jupyter notebooks
- Seeing Files
- Kernels
- Running cells
- State
- Code cells vs Markdown cells
- [Keyboard shortcuts](https://gist.github.com/discdiver/9e00618756d120a8c9fa344ac1c375ac)

# Markdown
I am typing in `markdown`. It's a lightweight markup language that Jupyter renders as HTML. Nothing "runs" in a Markdown cell. 

## It's good for communication!

###  Things can be different sizes

*Go to the __Help__ menu for a Markdown tutorial if needed.* 😀


In Slack and Jupyter, you can format your code with Markdown. This code will not run. 

One backtick:

`this is inline code` 


Three backticks:
``` 
this is a block of code
ok?
```

### In Slack - Give a smile, neutral, or frown emoji reaction for your comfort level with the following:

- Making a list.
- Getting the first item from a list.
- Getting the last item from a list.
- Making a dictionary.
- Getting a value from a dictionary.
- Making a list of dictionaries.
- Retrieving a value from a dictionary that is in a list.

## Python is a Calculator
_(...just like every other programming language)_

Let's learn some common mathematical operations with integer values.

In [1]:
# This is a code cell.
# This is a comment in a code cell.


# Addition
45 + 5

50

In [None]:
# Comments don't do anything but communicate (and stop the code in them from running)

In [2]:
# Subtraction (note we can have negative numbers!)
45 - 5


40

In [None]:
# Multiplication


In [None]:
# Division


In [None]:
# Exponentiation (do NOT use ^, use **)


In [None]:
# Modular division ("mod" for short, use %) (Remainder Division -- only returns the remainder)


In [None]:
# Floor division (ie "round down" division, use // -- doesn't include the remainder)


In [None]:
#Python follows PEMDAS order of operations 
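
As a rough sketch of the operators named in the cells above, each one might look like this. The numbers and variable names are illustrative assumptions, not part of the lesson.

```python
# Illustrative values only; the variable names are made up for this sketch.
product = 45 * 5            # multiplication -> 225
quotient = 45 / 5           # division always gives a float -> 9.0
power = 2 ** 5              # exponentiation -> 32
xor_not_power = 2 ^ 5       # ^ is bitwise XOR, not a power -> 7
remainder = 47 % 5          # mod keeps only the remainder -> 2
floored = 47 // 5           # floor division drops the remainder -> 9
pemdas = 2 + 3 * 4 ** 2     # exponent, then multiply, then add -> 50
with_parens = (2 + 3) * 4   # parentheses go first -> 20

print(product, quotient, power, xor_not_power)
print(remainder, floored, pemdas, with_parens)
```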


## Variables
Python is just a fancy calculator. It's also important for us to be able to save numbers as **variables** so we can reference them later without memorizing their value.

In [None]:
#can assign variables in python

#won't display anything until you call for the variable again
# x

In [None]:


#demo calling x in a cell above
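
A minimal sketch of assignment, assuming we just want to save the result of the addition from earlier. `doubled` is a made-up name for illustration.

```python
x = 45 + 5        # assignment: nothing is displayed
doubled = x * 2   # reuse the saved value without retyping it

print(x)          # 50
print(doubled)    # 100
```

In a notebook, putting `x` alone on the last line of a cell would display its value without needing `print`.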

## Naming Rules

> _There are only two hard things in Computer Science: cache invalidation and naming things._ - Phil Karlton

You can _pretty much_ name variables whatever you want. But, there are a few rules we should follow. Some are strict, some are just good manners.

### Variable naming rules (mandatory)
- Names can only consist of letters, underscores, and numbers.
- Names can't begin with numbers.
- You can't name a variable after a built-in Python keyword (eg `if`).
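
If you're unsure whether a name is allowed, Python can tell you. This is only a sketch using the standard library's `keyword` module and the string method `isidentifier()`. The names being tested are arbitrary examples.

```python
import keyword

if_is_keyword = keyword.iskeyword("if")        # True: reserved by Python
print_is_keyword = keyword.iskeyword("print")  # False: a built-in function, not a keyword

snake_ok = "my_car".isidentifier()     # True: letters and underscores
digit_first = "2fast".isidentifier()   # False: begins with a number
hyphenated = "my-car".isidentifier()   # False: hyphens aren't allowed

print(if_is_keyword, print_is_keyword)
print(snake_ok, digit_first, hyphenated)
```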

In [None]:
#examples of keywords:
#if
#for
#with
#etc.
#(int is not a keyword, but it is a built-in name, so avoid it too)

### Variable naming rules (good manners)
- Names should _**always**_ be descriptive (ie, don't name variables `x` and `df`)
- No capital letters!
- Variables should not begin with an underscore (this means something special)
- Multi-word variables should be in `snake_case`. All lower case separated by underscores.
- Technically, you _can_ name variables after built-in Python _functions_ (like `print`), but it's an _extremely_ bad idea to do so.
    - Rule of thumb: If a variable name turns green, don't use it!
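
Here is one way to see why reusing a built-in name is a bad idea. This sketch wraps the demo in a hypothetical helper function, `shadow_demo`, so the damage stays contained. Functions and `try`/`except` haven't been covered yet, so focus on the comments.

```python
def shadow_demo():
    # Hypothetical helper: reuse the built-in name `list` for our own data.
    list = ["Holly", "Mabel"]   # `list` now refers to our data, not the built-in
    try:
        list("abc")             # we wanted ['a', 'b', 'c'] ...
    except TypeError as err:
        return str(err)         # ... but our variable can't be called

message = shadow_demo()
print(message)  # 'list' object is not callable
```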
    

In [None]:
#snake case
my_car

#instead of:
#MyCar
#my-car

### Math exercise :
Recall the quadratic formula you may have learned for solving a quadratic equation $ax^2 + bx + c = 0$ with coefficients $a$, $b$, $c$:

$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$

### Let's turn that into code.

In [4]:
# Assign a, b and c for the quadratic equation to some value
a = -1
b = -8
c = 15
d = b * b - 4 * a * c
x = ((-1*b) + (d ** 0.5))/(2 * a)

In [5]:
# Slack thread: Give me the code to produce one of the two roots!
x

#Hint: taking something to the half power is the same as the square root

-9.567764362830022

In [None]:
#do it for "+" first, then minus

In [None]:
#x_neg
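
One possible way to get both roots, sketched with the same `a`, `b`, and `c` as above. The names `x_pos` and `x_neg` are assumptions suggested by the cell comments.

```python
a = -1
b = -8
c = 15

d = b * b - 4 * a * c               # the part under the square root: 124
x_pos = (-b + d ** 0.5) / (2 * a)   # the "+" root, about -9.568
x_neg = (-b - d ** 0.5) / (2 * a)   # the "-" root, about 1.568

print(x_pos, x_neg)

# Plugging a root back into the equation should give (almost exactly) zero
print(a * x_neg ** 2 + b * x_neg + c)
```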

## So, what is a "data type"?

Data can come in various **types.** We've already seen two types!

1. The `int` type: Integers with no decimal part (eg `2`, `-30`, `14`)
1. The `float` type: Numbers with a decimal part, even if that part is zero (eg `2.5`, `3.141`, `2**0.5`, `-3.0`)

Curious about what an object's data type is? Simply use the `type()` function to ask!

```python
type(3) # int
type(4.2) # float
```

In [None]:
#Int example

In [None]:
#float example

Get in the habit of checking the type of something when you don't know what it is.
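
A few quick checks, as a sketch, show how arithmetic decides the type of its result. The expressions are arbitrary examples.

```python
print(type(3))         # <class 'int'>
print(type(3.0))       # <class 'float'>
print(type(6 / 2))     # <class 'float'>: regular division always gives a float
print(type(7 // 2))    # <class 'int'>: floor division of two ints stays an int
print(type(2 ** 0.5))  # <class 'float'>
```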

## Strings

---

Strings are how we store text data in Python. Strings are _strings of characters_ between either double quotes (`"`) or single quotes (`'`). Python doesn't care which as long as they match.


In [None]:
#double quotes


In [None]:
#single quotes

In [None]:
#what about an apostrophe? incorrect example


In [None]:
#what about an apostrophe? correct example


In [6]:
# Escape characters
"this is a \n new line"

'this is a \n new line'

In [7]:
#print with escape character
print("This is a \n line")

This is a 
 line


The **print** function displays a value on the screen. 

**print** shows the string without the quotation marks, whereas ending a Jupyter cell with a bare string (or a variable holding one) displays it with the quotation marks and leaves escape characters like `\n` unprocessed.

You can use 'single' or "double" quotations to create a string variable.
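
A small sketch of the apostrophe problem and two common ways around it. The strings and variable names are made up for illustration.

```python
with_double = "It's a library"    # double quotes outside, apostrophe inside
with_escape = 'It\'s a library'   # or escape the apostrophe with a backslash
print(with_double == with_escape) # True: both are the same string

two_lines = "Be quiet\nthis is a library"
print(two_lines)        # printed on two lines, without quotes
print(repr(two_lines))  # roughly what the notebook shows for a bare string
```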

## f-strings
f-strings are a string formatting technique. Use them to include a variable inside a string.

In [8]:
#Assign a variable "name" with your name
name = "Shivaji"

In [10]:
#f"Hi. How are you. My name is {name}."
f"Hi. My name is {name}"

'Hi. My name is Shivaji'

Just put an `f` immediately before the quotes and put the variable in squiggly brackets.

Or do some math in the brackets.

In [None]:
#f'Hi. How are you. My name is {3+5}.' 
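
As a sketch, anything inside the brackets is evaluated before it's placed in the string, so math works too. The `:.2f` format shown here, which rounds to two decimal places, is an extra not covered in this lesson.

```python
name = "Shivaji"
greeting = f"Hi. My name is {name}."
math_in_braces = f"Three plus five is {3 + 5}."

one_third = 1 / 3
rounded = f"One third is about {one_third:.2f}"

print(greeting)        # Hi. My name is Shivaji.
print(math_in_braces)  # Three plus five is 8.
print(rounded)         # One third is about 0.33
```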

## String Math!
Besides simply storing text, we can also operate on strings. Everything in Python has a **type**, and types can be operated on with their respective **methods**. Methods are actions we can perform on an object using the following syntax:

```python
variable.method(arguments)
```

In [18]:
#make two different strings, s1 & s2
s1 = "Be quiet"
s2 = "this is a library"

s1


In [19]:
#show s2
s2


'this is a library'

In [21]:
#add the two strings together, create new variable
s1 + s2


'Be quietthis is a library'

In [None]:
#display the new variable


In [None]:
#format it better


#### Uppercasing is a method in Python

In [None]:
#.upper()

In [None]:
#.lower()


In [None]:
#.title()
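
A quick sketch of these three methods, plus one way to join the two strings more readably. The result names (`shouted`, `sentence`, and so on) are illustrative.

```python
s1 = "Be quiet"
s2 = "this is a library"

shouted = s1.upper()        # 'BE QUIET'
whispered = s1.lower()      # 'be quiet'
titled = s2.title()         # 'This Is A Library'
sentence = f"{s1}, {s2}."   # 'Be quiet, this is a library.'

print(shouted, whispered, titled)
print(sentence)
print(s1)  # still 'Be quiet': string methods return a new string
```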


In [24]:
mycommand = s1 + " " + s2
mycommand

'Be quiet this is a library'

#### There are plenty of string methods. Let's try out Jupyter's autocomplete feature to see what we can do!

In [None]:
#use . tab complete to see what's available

#### Google "common python string commands" 

## Slicin' Strings
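We may also want to pick apart our strings. We can do this by **indexing** or **slicing**. In fact, you can index or slice several different sequence types in Python. For example:

- Strings
- Lists
- Tuples

(Sets are the exception among the collection types: they're unordered, so they can't be indexed or sliced.)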

---

All of the above types can be accessed using brackets in the following ways:

- **`s[0]`** References the first element.
- **`s[0:4]`** References the first **4** elements, starting from index **`0`**.
- **`s[-1]`** References the _first_ item in reverse order (the last item).
- **`s[-2]`** References the _second_ item in reverse order (the second-to-last item).
- **`s[0:-3]`** References everything _except the last 3_ elements.


In [None]:
#show s1

In [None]:
#show s1 first index

In [None]:
# First character


In [None]:
# Second character


In [None]:
# Second through fourth characters
# Remember it is NOT inclusive of the stop index


In [None]:
# First 5 characters


In [None]:
# Last character [-1]


In [None]:
# Second last character [-2]

In [None]:
# Last 5 characters


In [None]:
#Slack: from my command, how would you isolate/slice the word "this"?
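
One possible set of answers for the slicing cells above, sketched against the `mycommand` string. Using `.find()` to locate the word "this" is just one approach; counting characters by hand works too.

```python
mycommand = "Be quiet this is a library"

first_char = mycommand[0]           # 'B'
second_char = mycommand[1]          # 'e'
second_to_fourth = mycommand[1:4]   # 'e q' (index 4 is not included)
first_five = mycommand[:5]          # 'Be qu'
last_char = mycommand[-1]           # 'y'
second_last = mycommand[-2]         # 'r'
last_five = mycommand[-5:]          # 'brary'

start = mycommand.find("this")            # 9: where "this" begins
this_word = mycommand[start:start + 4]    # 'this'

print(first_five, last_five, this_word)
```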


In [26]:
#separate words in a string using "split"
# shift + tab gives you the function signature

list_of_words = mycommand.split()
list_of_words

['Be', 'quiet', 'this', 'is', 'a', 'library']

In [27]:
#Demo indexing into the list
list_of_words[2]

'this'

## Collection Types!

![](assets/Collections.png)

We often want to store many values in one variable. A _collection_. There are several collection types in Python. The first and most common is...

### Lists
Lists are mutable, heterogeneous collections. You can add and subtract things from lists.

- **Mutable** = They can be changed
- **Heterogeneous** = They can hold values of different data types
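
Both properties can be seen in a short sketch. The list `mixed` and its contents are invented for illustration.

```python
mixed = ["Holly", 3, 2.5, True]   # heterogeneous: str, int, float, bool

mixed[1] = "three"    # mutable: replace an item in place
mixed.append(None)    # mutable: grow the list

print(mixed)       # ['Holly', 'three', 2.5, True, None]
print(len(mixed))  # 5
```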

In [28]:
#make a list of names
names = ['Holly', 'Mabel', 'Tom', 'Jamar', 'Melissa']

In [29]:
#display the list
names

['Holly', 'Mabel', 'Tom', 'Jamar', 'Melissa']

In [30]:
# Reference 1st item (index 0)
names[0]

'Holly'

In [31]:
# Everything from the 2nd item on
names[1:]

['Mabel', 'Tom', 'Jamar', 'Melissa']

In [32]:
# Every other name (start:stop:step)
names[0:5:2]

['Holly', 'Tom', 'Melissa']

In [33]:
# All the names in reverse order (step of -1)
names[::-1]

['Melissa', 'Jamar', 'Tom', 'Mabel', 'Holly']

In [None]:
# Every other name
#first element: start, second element: end, third element: step


In [34]:
# Backwards! Reverse a list
names[::-1]

['Melissa', 'Jamar', 'Tom', 'Mabel', 'Holly']

### List Methods

In [35]:
# Append
names.append('Sumati')

In [36]:
names

['Holly', 'Mabel', 'Tom', 'Jamar', 'Melissa', 'Sumati']

In [37]:
# Remove
names.remove('Holly')

In [38]:
names

['Mabel', 'Tom', 'Jamar', 'Melissa', 'Sumati']

In [39]:
# Join??? Join is different but powerful.
"; ".join(names)

'Mabel; Tom; Jamar; Melissa; Sumati'

In [42]:
#sorting if necessary
" and ".join(names)
sorted_names = sorted(names)
sorted_names

['Jamar', 'Mabel', 'Melissa', 'Sumati', 'Tom']

### Tuples
Tuples are less common than lists, but very similar. They are immutable and heterogeneous. We don't usually create them ourselves, but they do get returned from various functions.

- **Immutable** = Once made, they can never be changed.
- **Heterogeneous** = They can contain values of different types

For our purposes, you can just think of tuples as immutable lists. Their existence is partly legacy from a time when they were more useful. Traditionally they're only used to hold short sequences of variables.
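
As a sketch of both points, the built-in `divmod` is one function that returns a tuple, and trying to change a tuple raises an error. The variable names are illustrative, and `try`/`except` is used only to capture the error message.

```python
result = divmod(47, 5)          # returns a tuple: (9, 2)
quotient, remainder = result    # "unpack" the tuple into two variables
print(result, quotient, remainder)

family = ("Pratima", "Krish")
print(family[0])                # indexing works just like a list

try:
    family[0] = "Someone else"  # changing an item does not
except TypeError as err:
    error_message = str(err)

print(error_message)  # 'tuple' object does not support item assignment
```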

In [44]:
#tuples in (), list in []
family = ("Pratima", "Krish")
family

('Pratima', 'Krish')

In [45]:
#check type
type(family)

tuple

In [46]:
#Can't append to a tuple
family.append('brother')

AttributeError: 'tuple' object has no attribute 'append'

## Sets
Sets are unordered collections of unique values. Just like traditional sets in a math class.

We'll see sets rarely, but it's worth knowing they exist. They come in handy in coding challenges. 😉
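
Here is a sketch of where sets tend to come in handy: removing duplicates and comparing groups. The example values are made up, and `sorted()` is used so the printed order is predictable.

```python
grades = [1, 23, 5, 6, 7, 4, 5, 6, 7, 4, 5]
unique_grades = set(grades)       # duplicates disappear

print(len(grades), len(unique_grades))  # 11 6
print(sorted(unique_grades))            # [1, 4, 5, 6, 7, 23]
print(23 in unique_grades)              # True: membership test

in_both = {1, 2, 3} & {2, 3, 4}     # intersection
in_either = {1, 2, 3} | {2, 3, 4}   # union
print(sorted(in_both), sorted(in_either))  # [2, 3] [1, 2, 3, 4]
```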

In [47]:
#type in a set 
my_grades = {1, 23, 5, 6, 7, 4, 5, 6, 7, 4, 5}

In [48]:
#show set
my_grades

{1, 4, 5, 6, 7, 23}

In [49]:
#check type
type(my_grades)

set

## Dictionaries!

![](assets/phonebook.jpeg)

Dictionaries are very common. They're mutable collections of key-value pairs, looked up by key rather than by position. 

Think of them like an actual dictionary. The key is the "word" and the value is the "definition".

In [51]:
#make a dictionary of state capitals as a quiz
state_capitals = {'Washington': ['Olympia', 'Seattle'], 
                  'Texas':['Austin', 'Dallas'],
                  'Colorado': ['Denver', 'Boulder', 'Aspen']}


In [52]:
# Indexing -- grab the state "Texas"
state_capitals['Texas']

['Austin', 'Dallas']

In [53]:
# Bzzt! Remember, dictionaries are looked up by key, not by position. No such thing as index 0
# state_capitals[0]
# This is not how you add a key-value pair, either:
state_capitals['Massachusetts': ['Boston']]

TypeError: unhashable type: 'slice'

However, the items in them will print in the order that they were inserted! (Dictionaries have been guaranteed to keep insertion order since Python 3.7; this was not the case in earlier versions.)

In [None]:
#delete a key-value pair


In [None]:
#add a state capital
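
A minimal sketch of adding and deleting entries, using the same `state_capitals` dictionary. `pop` is shown as an alternative to `del`, and `removed` is an illustrative name.

```python
state_capitals = {'Washington': ['Olympia', 'Seattle'],
                  'Texas': ['Austin', 'Dallas'],
                  'Colorado': ['Denver', 'Boulder', 'Aspen']}

# Add (or overwrite) a key-value pair with assignment
state_capitals['Massachusetts'] = ['Boston']

# Delete a key-value pair
del state_capitals['Texas']

# pop also deletes, and hands back the value it removed
removed = state_capitals.pop('Colorado')

print(list(state_capitals))  # ['Washington', 'Massachusetts']
print(removed)               # ['Denver', 'Boulder', 'Aspen']
```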


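`my_dict.get(some_key)` is often the safer way to access a value in a dictionary: if the key is missing, it returns `None` (or a default you supply) instead of raising a `KeyError`.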
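
The difference between `.get` and square brackets shows up when a key is missing. This is a sketch with a trimmed-down dictionary. The key `'Ohio'` is deliberately absent.

```python
state_capitals = {'Washington': ['Olympia', 'Seattle'],
                  'Texas': ['Austin', 'Dallas']}

texas = state_capitals.get('Texas')        # ['Austin', 'Dallas']
missing = state_capitals.get('Ohio')       # None, and no error
fallback = state_capitals.get('Ohio', [])  # supply your own default instead

try:
    state_capitals['Ohio']                 # square brackets raise an error
except KeyError:
    bracket_lookup_failed = True

print(texas, missing, fallback, bracket_lookup_failed)
```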

## Dictionaries are a big deal!

Dictionaries can get really big and really complicated, like the one below. 

This is a very efficient way to store complicated data that don't fit neatly in a spreadsheet. In fact, most web APIs return JSON, which Python parses into dictionaries and lists! We'll need to parse big dictionaries to get data from the internet!

Looking up a value by key in a dictionary is also generally faster than searching through a list, especially as the data grows.
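
To make the web connection concrete, here is a sketch using the standard library's `json` module. The `response_text` string is invented for illustration and does not come from a real API.

```python
import json

# A made-up response body, written as JSON text (not real data)
response_text = '{"author": "Suzanne Collins", "books": ["The Hunger Games", "Catching Fire"], "active": true, "phone": null}'

data = json.loads(response_text)   # JSON text -> Python objects

print(type(data))            # <class 'dict'>
print(type(data["books"]))   # <class 'list'>
print(data["books"][1])      # Catching Fire
print(data["active"], data["phone"])  # True None
```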

In [56]:
authors = {
    "J.R.R. Tolkien": {
        "genre": "fantasy",
        "books": [
            "The Fellowship of the Ring",
            "The Two Towers",
            "The Return of the King"
        ],
        "active": False
    },
    "J.K. Rowling": {
        "genre": "fantasy",
        "books": [
            "The Sorcerer's Stone",
            "The Chamber of Secrets",
            "The Prisoner of Azkaban",
            "The Goblet of Fire",
            "The Order of the Phoenix",
            "The Half-Blood Prince",
            "The Deathly Hallows"
            
        ],
        "active": True,
        "phone": {
            "home": "(281) 330-8004",
            "work": "(800) HP0-TTER"
        }
    },
    "Suzanne Collins": {
        "genre": "science fiction",
        "books": ["The Hunger Games",
                 "Catching Fire",
                 "Mockingjay"],
        "phone": None,
        "active": True
    }
}

In [57]:
#check type of authors
type(authors)

dict

What `types` are in the dictionary?

In [59]:
#check type of J.R.R. Tolkien
type(authors["J.R.R. Tolkien"])

dict

In [60]:
#go deeper: author -> 'books' list -> index 2
authors["Suzanne Collins"]["books"][2]

'Mockingjay'

In [None]:
#go another level deeper what are the types within books?


In [63]:
# Code To get Suzanne Collins' 3rd book
authors["Suzanne Collins"]["books"][2]

'Mockingjay'

## `.items()` gives you keys and values
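
A short sketch of `.items()` next to its siblings `.keys()` and `.values()`. The `capitals` dictionary is a simplified, made-up example, and wrapping each call in `list()` just makes the result easy to read.

```python
capitals = {"Washington": "Olympia", "Texas": "Austin", "Colorado": "Denver"}

keys = list(capitals.keys())      # ['Washington', 'Texas', 'Colorado']
values = list(capitals.values())  # ['Olympia', 'Austin', 'Denver']
pairs = list(capitals.items())    # a list of (key, value) tuples

print(pairs)
# [('Washington', 'Olympia'), ('Texas', 'Austin'), ('Colorado', 'Denver')]

first_state, first_capital = pairs[0]   # each pair unpacks into two variables
print(first_state, first_capital)       # Washington Olympia
```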

## Booleans

![](assets/boole.jpg)

Booleans are values that can only be one of two things: `True` or `False`. 

They're named after **George Boole**, whose algebra of logic they come from, and will come in real handy when we discuss control flow this afternoon.

You can do three operations with booleans : `not`, `and`, and `or`.

In [None]:
# set a variable to True


In [None]:
#check its type


In [None]:
#set a variable to False


`not`: Simply gives the opposite

In [None]:
#set a variable to not True


`and`: `A and B` only yields `True` if both A and B are true




In [None]:
#set three boolean values


In [None]:
#test the first two


In [None]:
#test the next two


`or`: `A or B` only yields `False` if both A and B are false

In [None]:
#test two with or
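
The three operations can be sketched with a few made-up variables. The names `is_open`, `is_quiet`, and `has_books` are illustrative only.

```python
is_open = True
is_quiet = False
has_books = True

print(type(is_open))            # <class 'bool'>
print(not is_open)              # False: the opposite of True
print(is_open and is_quiet)     # False: both must be True
print(is_open and has_books)    # True
print(is_open or is_quiet)      # True: at least one is True
print(is_quiet or False)        # False: both are False
```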


## Cool story, Boole
So what? We rarely actually define variables to be `True` or `False`. More often, we get them from asking Python math problems.

In [64]:
# Greater than
5 > 3

True

In [65]:
# Less than
5 < 3

False

In [66]:
# Greater than or equal to
5>= 5

True

In [67]:
# Break it into parts WITHIN parentheses
(3 < 4) and (4 > 5)

False

In [68]:
# Not equal to
3 != 4

True

In [69]:
# Equal to
5 == 5

True

A single equals sign (`=`) is assignment, NOT a check of whether two values are the same. Use `==` to compare. 

Common error!
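
Side by side, as a sketch, the two look like this. `score` is a made-up variable name.

```python
score = 5               # one equals sign: store 5 in score
is_five = score == 5    # two equals signs: ask a question, get a boolean
is_three = score == 3

print(score, is_five, is_three)  # 5 True False
```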

## Summary

We covered:

- Basic Jupyter Notebook use
- Basic math in Python
- String manipulation in Python
- Collection data types in Python
- Booleans in Python

## Check for Understanding

- Make a list
- Get the second to last item from the list
- Make a dictionary
- Get a value from the dictionary
- Make a string variable
- Reverse the string