<img src="https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png" style="float: left; margin: 10px;"> 

# Python Intro 1: Data Types

Authors: _Tim Book & Jeff Hale_

---

### Learning objectives
*After this lesson, you will be able to:*
- Use JupyterLab features with Jupyter notebooks
- Create numbers, strings, booleans, lists, dictionaries and other data structures
- Demonstrate arithmetic, boolean, and string operations
- Use list and dictionary methods

### Prerequisites
*Before this lesson, you should already:*

- Be able to navigate the command line
- Have Python installed in an active conda environment


## Let's chat about JupyterLab and Jupyter notebooks

- JupyterLab vs Jupyter notebooks
- Seeing Files
- Kernels
- Running cells
- State
- Code cells vs Markdown cells
- [Keyboard shortcuts](https://gist.github.com/discdiver/9e00618756d120a8c9fa344ac1c375ac)

# Markdown
I am typing in `markdown`. It's a lightweight markup language that gets rendered as HTML (and you can mix raw HTML into it). Nothing "runs" in a Markdown cell. 

## It's good for communication!

###  Things can be different sizes

*Go to the __Help__ menu for a Markdown tutorial if needed.* 😀


In Slack and Jupyter, you can format your code with Markdown. This code will not run. 

One backtick:

`this is inline code` 


Three backticks:
``` 
this is a block of code
ok?
```

### In Slack - Give a smile, neutral, or frown emoji reaction for your comfort level with the following:

- Making a list.
- Getting the first item from a list.
- Getting the last item from a list.
- Making a dictionary.
- Getting a value from a dictionary.
- Making a list of dictionaries.
- Retrieving a value from a dictionary that is in a list.

## Python is a Calculator
_(...just like every other programming language)_

Let's learn some common mathematical operations with integer values.

In [4]:
# Addition

In [1]:
# Comments don't do anything but communicate (and stop the code in them from running)

In [2]:
# Subtraction (note we can have negative numbers!)

In [3]:
# Multiplication


In [4]:
# Division

In [5]:
# Exponentiation (do NOT use ^, use **)


In [6]:
# Modular division ("mod" for short, use %) (Remainder Division -- only returns the remainder)


In [7]:
# Floor division (ie "round down" division, use // -- doesn't include the remainder)


In [8]:
#Python follows PEMDAS order of operations 


In [None]:
# without PEMDAS
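
As a reference for the empty cells above, here is a minimal sketch of each operation. The specific numbers and variable names are arbitrary choices for illustration, not part of the original lesson.

```python
# Arithmetic with integers; each result is stored so it can be inspected later
total = 7 + 3           # addition
difference = 3 - 7      # subtraction can go negative
product = 7 * 3         # multiplication
quotient = 7 / 2        # division always returns a float
power = 2 ** 5          # exponentiation
not_power = 2 ^ 5       # ^ is bitwise XOR, NOT exponentiation
remainder = 7 % 3       # modulo: only the remainder
floored = 7 // 2        # floor division: drops the remainder

# Order of operations: multiplication happens before addition...
with_pemdas = 2 + 3 * 4
# ...unless parentheses say otherwise
with_parentheses = (2 + 3) * 4

print(total, difference, product, quotient, power, not_power, remainder, floored)
# prints: 10 -4 21 3.5 32 7 1 3
print(with_pemdas, with_parentheses)
# prints: 14 20
```

Note how `2 ^ 5` quietly gives `7` instead of an error, which is why the `^` mistake is easy to miss.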


## Variables
Python is just a fancy calculator. It's also important for us to be able to save numbers as **variables** so we can reference them later without memorizing their value.

In [5]:
#can assign variables in python

#won't display anything until you call for the variable again
x = 3

In [6]:
x

#demo calling x in a cell above

3

In [7]:
y = 4
z = 2

In [8]:
y

4

In [9]:
z

2

In [11]:
# Now you can add variables together!
x + y

7

In [12]:
# With all 3
(x + y) * z

14

## Naming Rules

> _There are only two hard things in Computer Science: cache invalidation and naming things._ - Phil Karlton

You can _pretty much_ name variables whatever you want. But, there are a few rules we should follow. Some are strict, some are just good manners.

### Variable naming rules (mandatory)
- Names can only consist of letters, underscores, and numbers.
- Names can't begin with numbers.
- You can't name a variable after a built-in Python keyword (eg `if`).

In [None]:
#examples:
if
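int  # careful: int is a built-in type, not a keyword, so Python would let you overwrite it (still a bad idea)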
for
with
# etc.

### Variable naming rules (good manners)
- Names should _**always**_ be descriptive (ie, don't name variables `x` and `df`)
- No capital letters!
- Variables should not begin with an underscore (this means something special)
- Multi-word variables should be in `snake_case`. All lower case separated by underscores.
- Technically, you _can_ name variables after built-in Python _functions_ (like `print`), but it's an _extremely_ bad idea to do so.
    - Rule of thumb: If a variable name turns green, don't use it!
    

In [11]:
#snake case


#instead of:
#MyCar
#my-car
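
If you are ever unsure whether a name is allowed, Python can tell you. The sketch below uses the standard library's `keyword` module and the string method `isidentifier()`; the example names (`my_car`, `2cars`) are made up for illustration.

```python
import keyword

# Reserved keywords can never be variable names
print(keyword.iskeyword("if"))     # True
print(keyword.iskeyword("for"))    # True

# Built-in names such as int and print are NOT keywords, so Python
# will let you overwrite them (which is why it is such a bad idea)
print(keyword.iskeyword("int"))    # False
print(keyword.iskeyword("print"))  # False

# str.isidentifier() checks the character rules: letters, digits,
# underscores, and no leading digit
print("my_car".isidentifier())     # True
print("my-car".isidentifier())     # False
print("2cars".isidentifier())      # False
```

Keep in mind that `isidentifier()` only checks the characters, so a keyword like `"if"` still passes it. A name has to pass both checks to be usable.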

### Math exercise :
Recall the quadratic formula you may have learned for solving a quadratic equation $ax^2 + bx + c = 0$ with coefficients $a$, $b$, $c$:

$$ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} $$

### Let's turn that into code.

In [23]:
# Assign a, b and c for the quadratic equation to some value
a = -1
b = -8
c = 15

In [21]:
# Slack thread: Give me the code to produce one of the two roots!

#Hint: taking something to the half power is the same as the square root

In [12]:
#do it for "+" first, then minus


In [25]:
# Parentheses around 2*a are required: "/ 2*a" would divide by 2 and then multiply by a
x = (-b + (b**2 - 4*a*c)**0.5) / (2*a)

In [26]:
x

-9.567764362830022
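
Here is one possible way to get both roots and check them, using the same `a`, `b`, and `c` as above. The names `discriminant`, `root_plus`, and `root_minus` are just illustrative choices. The denominator is written as `(2*a)` because `/` and `*` are evaluated left to right, so `/ 2*a` would divide by 2 and then multiply by `a`.

```python
a = -1
b = -8
c = 15

# The part under the square root
discriminant = b**2 - 4*a*c

# Raising to the 0.5 power is the same as taking the square root
root_plus = (-b + discriminant**0.5) / (2*a)
root_minus = (-b - discriminant**0.5) / (2*a)
print(root_plus, root_minus)

# Sanity check: plugging a root back in should give (almost exactly) zero
residual_plus = a*root_plus**2 + b*root_plus + c
residual_minus = a*root_minus**2 + b*root_minus + c
print(residual_plus, residual_minus)
```

The residuals may not be exactly `0` because floats carry tiny rounding errors, but they should be extremely close.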

## So, what is a "data type"?

Data can come in various **types.** We've already seen two types!

1. The `int` type: Integers with no decimal part (eg `2`, `-30`, `14`)
1. The `float` type: Numbers with a decimal part, even if that part is zero (eg `2.5`, `3.141`, `2**0.5`, `-3.0`)

Curious about what an object's data type is? Simply use the `type()` function to ask!

```python
type(3) # int
type(4.2) # float
```

In [27]:
#Int example
type(5)

int

In [28]:
#float example
type(3.78)

float

Get in the habit of checking the type of something when you don't know what it is.
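
For example, the type of a result is not always what you might guess. The short sketch below (variable names are illustrative) shows which operations produce an `int` and which produce a `float`, plus the built-in `int()` and `float()` functions for converting between them.

```python
true_division = 8 / 2      # 4.0 -- "/" always gives a float
floor_division = 8 // 2    # 4   -- "//" on two ints gives an int
mixed = 3 + 4.0            # 7.0 -- mixing int and float gives a float
truncated = int(3.78)      # 3   -- int() chops off the decimal part
as_float = float(5)        # 5.0

print(type(true_division), type(floor_division), type(mixed))
# prints: <class 'float'> <class 'int'> <class 'float'>
```

Inside `print()`, a type shows up as `<class 'float'>`; when a cell simply ends with `type(...)`, Jupyter shows the shorter `float`.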

## Strings

---

Strings are how we store text data in Python. Strings are _strings of characters_ between either double quotes (`"`) or single quotes (`'`). Python doesn't care which as long as they match.


In [29]:
#double quotes
"This is a string"

'This is a string'

In [30]:
#single quotes
'This is a string'

'This is a string'

In [31]:
#what about an apostrophe? incorrect example
'This is a contraction, ya'll'

SyntaxError: invalid syntax (<ipython-input-31-3d37775455e2>, line 2)

In [32]:
#what about an apostrophe? correct example
"This is a contraction, ya'll"

"This is a contraction, ya'll"

In [33]:
# Escape characters
"This is a \n new line"

'This is a \n new line'

In [34]:
#print with escape character
print("This is a \n new line")

This is a 
 new line


The **print** function displays a value on the screen.

**print** removes the quotation marks and interprets escape characters such as `\n`, whereas just running a Jupyter cell with the string on the last line leaves the quotation marks in and shows the escape character as typed.

You can use 'single' or "double" quotations to create a string variable.
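
To see the two displays side by side in one cell, the built-in `repr()` function gives the same quoted form Jupyter echoes for a string. This is a small sketch; the `greeting` variable is made up for illustration.

```python
greeting = "Hello\nworld"

# repr() gives the "as typed" form, quotes and escape characters included
greeting_as_typed = repr(greeting)
print(greeting_as_typed)   # 'Hello\nworld'

# print() on the string itself turns \n into a real line break
print(greeting)
# Hello
# world

# The escape sequence \n counts as a single character
print(len(greeting))       # 11
```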

## f-strings
String formatting technique. Use f-strings to include a variable inside a string.

In [39]:
#Assign a variable "name" with your name
name = "Chris"

In [40]:
f"Hi. How are you. My name is {name}."

'Hi. How are you. My name is Chris.'

Just put an `f` immediately before the quotes and put the variable in squiggly brackets.

Or do some math in the brackets.
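
The brackets accept more than plain variables. As a rough sketch (the `price` and `count` variables are invented for illustration), you can call a method inside them, or add a format spec after a colon to control how a number is displayed:

```python
name = "Chris"
price = 3.14159
count = 1234567

# :.2f rounds the display to 2 decimal places
receipt = f"{name} paid ${price:.2f}"
# :, adds thousands separators
big_number = f"{count:,}"
# Method calls work inside the brackets too
shout = f"{name.upper()}!"

print(receipt)      # Chris paid $3.14
print(big_number)   # 1,234,567
print(shout)        # CHRIS!
```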

In [42]:
f"Hi, how are you? I'm doing some math {3+5}."

"Hi, how are you? I'm doing some math 8."

## String Math!

In [45]:
#make two different strings, s1 & s2
s1 = "Be quiet"
s2 = "this is a library"

In [46]:
#show s1
s1

'Be quiet'

In [47]:
#show s2
s2

'this is a library'

In [48]:
#add the two strings together, create new variable
my_command = s1 + s2

In [49]:
#display the new variable
my_command

'Be quietthis is a library'

In [50]:
#format it better
my_command = s1 + ", " + s2

In [51]:
my_command

'Be quiet, this is a library'

## Methods

Besides simply storing text, we can also operate on strings. Everything in Python is an **object**, and most objects can be operated on with their respective **methods**. Methods are actions we can perform on an object using the following syntax:

```python
variable.method(arguments)
```

#### Uppercasing is a method in Python

In [52]:
#.upper()
my_command.upper()

'BE QUIET, THIS IS A LIBRARY'

In [53]:
#.lower()
my_command.lower()

'be quiet, this is a library'

In [54]:
#.title()
my_command.title()

'Be Quiet, This Is A Library'

#### There are plenty of string methods. Let's try out Jupyter's autocomplete feature to see what we can do!

In [None]:
#use . tab complete to see what's available
my_command.

#### Google "common python string commands" 
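
A few string methods you are likely to run into are sketched below, using the same `my_command` string as above (the result variable names are illustrative). Notice that none of them change the original: strings are immutable, so every method returns a new value.

```python
my_command = "Be quiet, this is a library"

louder = my_command.replace("quiet", "loud")   # swap one piece of text for another
starts_with_be = my_command.startswith("Be")   # True or False
number_of_i = my_command.count("i")            # how many times "i" appears
position = my_command.find("this")             # index where "this" begins
trimmed = "   padded   ".strip()               # remove surrounding whitespace

print(louder)           # Be loud, this is a library
print(starts_with_be)   # True
print(number_of_i)      # 4
print(position)         # 10
print(trimmed)          # padded
print(my_command)       # Be quiet, this is a library (unchanged)
```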

## Slicin' Strings
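We may also want to pick apart our strings. We can do this by **indexing** or **slicing**. In fact, you can index or slice several different ordered types in Python. For example:

- Strings
- Lists
- Tuples

Sets are the exception among the collections we'll meet: they are unordered, so they can't be indexed or sliced.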

---

All of the above types can be accessed using brackets in the following ways:

- **`s[0]`** References the first element.
- **`s[0:4]`** References the first **4** elements, starting from index **`0`**.
- **`s[-1]`** References the _first_ item in reverse order (the last item).
- **`s[-2]`** References the _second_ item in reverse order (the second-to-last item).
- **`s[0:-3]`** References everything _except the last 3_ elements.


In [55]:
#show s1
s1

'Be quiet'

In [56]:
#show s1 first index
s1[0]

'B'

In [57]:
# Second character
s1[1]

'e'

In [58]:
# Third character
s1[2]

' '

In [62]:
# Second through fourth characters
# The end index is NOT included, so this stops at index 3
s1[1:4]

'e q'

In [63]:
# First 5 characters
s1[0:5]

'Be qu'

In [64]:
# Last character [-1]
s1[-1]

't'

In [65]:
# Second last character [-2]
s1[-2]

'e'

In [66]:
# Last 5 characters
s1[-5:]

'quiet'

In [74]:
#Slack: from my_command, how would you isolate/slice the word "this"?
#(s2[0:4] gives the same word from the shorter string)
my_command[10:14]

'this'

In [75]:
#separate words in a string using "split"
# shift + tab gives you the function signature
my_command.split()

['Be', 'quiet,', 'this', 'is', 'a', 'library']

In [76]:
list_of_words = my_command.split()

In [40]:
#Demo indexing into the list
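
One way that demo could look is sketched below. Everything that worked on strings works on the list that `split()` returns, and you can chain brackets to reach a character inside one of the words. The result variable names are illustrative.

```python
my_command = "Be quiet, this is a library"
list_of_words = my_command.split()   # splits on whitespace by default

first_word = list_of_words[0]        # 'Be'
last_word = list_of_words[-1]        # 'library'
middle_words = list_of_words[2:4]    # ['this', 'is']
first_letter = list_of_words[1][0]   # first character of the second word: 'q'

# split() also accepts the separator to split on
two_halves = my_command.split(", ")  # ['Be quiet', 'this is a library']

print(first_word, last_word, middle_words, first_letter)
print(two_halves)
```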


## Collection Types!

![](assets/Collections.png)

We often want to store many values in one variable. A _collection_. There are several collection types in Python. The first and most common is...

### Lists
Lists are mutable, heterogeneous collections. You can add things to lists and remove things from them.

- **Mutable** = They can be changed
- **Heterogeneous** = They can hold values of different data types

In [77]:
#make a list of names
names = ['Mei Ling', 'Lauren', 'Michael', 'Ranga', 'Nyan']

In [78]:
#display the list
names

['Mei Ling', 'Lauren', 'Michael', 'Ranga', 'Nyan']

In [79]:
#display type
type(names)

list

In [80]:
# Reference 1st item
names[0]

'Mei Ling'

In [81]:
# Reference 2nd item
names[1]

'Lauren'

In [82]:
# The last one
names[-1]

'Nyan'

In [83]:
# Every other name
#first number: start, second number: end (not included), third number: step
names[0:5:2]

['Mei Ling', 'Michael', 'Nyan']

In [84]:
# Backwards! Reverse a list
names[::-1]

['Nyan', 'Ranga', 'Michael', 'Lauren', 'Mei Ling']

### List Methods

In [87]:
# Append
names.append('Mollika')

In [88]:
names

['Mei Ling', 'Lauren', 'Michael', 'Ranga', 'Nyan', 'Mollika']

In [89]:
# Remove
names.remove('Michael')

In [90]:
names

['Mei Ling', 'Lauren', 'Ranga', 'Nyan', 'Mollika']

In [92]:
# Join??? Join is different but powerful: it's a method of the separator string, not of the list
" and ".join(names)

'Mei Ling and Lauren and Ranga and Nyan and Mollika'

In [93]:
#sorting if necessary
names.sort()

In [94]:
names

['Lauren', 'Mei Ling', 'Mollika', 'Nyan', 'Ranga']
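
A few more list tools are sketched below, starting from the list as it looked before sorting. One detail worth knowing: `.sort()` changes the list in place and returns `None`, while the built-in `sorted()` leaves the list alone and hands back a new sorted copy. The result variable names are illustrative.

```python
names = ['Mei Ling', 'Lauren', 'Ranga', 'Nyan', 'Mollika']

# sorted() returns a new list; names itself is untouched
alphabetical = sorted(names)

# pop() removes the last item and gives it back to you
last_name = names.pop()

# insert() adds an item at a specific position
names.insert(0, 'Michael')

# len() counts the items, and "in" checks membership
how_many = len(names)
has_ranga = 'Ranga' in names

print(alphabetical)  # ['Lauren', 'Mei Ling', 'Mollika', 'Nyan', 'Ranga']
print(last_name)     # Mollika
print(names)         # ['Michael', 'Mei Ling', 'Lauren', 'Ranga', 'Nyan']
print(how_many, has_ranga)  # 5 True
```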

### Tuples
Tuples are less common than lists, but very similar. They are immutable and heterogeneous. We don't usually create them ourselves, but they do get returned from various functions.

- **Immutable** = Once made, they can never be changed.
- **Heterogeneous** = They can contain values of different types

For our purposes, you can just think of tuples as immutable lists. They're typically used to hold short, fixed sequences of values, such as several values returned together from a function.

In [96]:
#tuples in (), list in []
family = ("Jacob", "Lunchbox")

In [97]:
family

('Jacob', 'Lunchbox')

In [98]:
#check type
type(family)

tuple

In [99]:
#Can't append to a tuple
family.append('dog')

AttributeError: 'tuple' object has no attribute 'append'
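
Two things you can do with a tuple are sketched below: unpack it into separate variables, and convert it to a list when you really do need to change it. The built-in `divmod()` is used as an example of a function that returns a tuple; the names `person`, `pet`, and `bigger_family` are made up for illustration.

```python
family = ("Jacob", "Lunchbox")

# Unpacking: one variable per item in the tuple
person, pet = family

# divmod() returns a tuple of (floor division result, remainder)
quotient, remainder = divmod(7, 2)

# To "change" a tuple, build a new one (here by going through a list)
family_list = list(family)
family_list.append("dog")
bigger_family = tuple(family_list)

print(person, pet)          # Jacob Lunchbox
print(quotient, remainder)  # 3 1
print(bigger_family)        # ('Jacob', 'Lunchbox', 'dog')
print(family)               # ('Jacob', 'Lunchbox') -- the original is unchanged
```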

## Sets
Sets are unordered collections of unique values. Just like traditional sets in a math class.

We'll see sets rarely, but it's worth knowing they exist. They come in handy in coding challenges. 😉

In [100]:
#type in a set 
my_grades = {1, 23, 5, 6, 7, 4, 5, 6, 7, 4, 5}

In [101]:
my_grades

{1, 4, 5, 6, 7, 23}

In [57]:
#show set


In [None]:
#check type
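
Here is a rough sketch of what sets are handy for: removing duplicates, checking membership, and the math-class operations of intersection and union. The `other_grades` set is invented for illustration, and `sorted()` is used when printing because a set has no guaranteed display order.

```python
my_grades = {1, 23, 5, 6, 7, 4, 5, 6, 7, 4, 5}

# Duplicates are dropped automatically
unique_count = len(my_grades)
grades_type = type(my_grades)
has_five = 5 in my_grades

other_grades = {5, 6, 100}
in_both = my_grades & other_grades     # intersection
in_either = my_grades | other_grades   # union

print(unique_count)       # 6
print(has_five)           # True
print(sorted(in_both))    # [5, 6]
print(sorted(in_either))  # [1, 4, 5, 6, 7, 23, 100]
```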


## Dictionaries!

![](assets/phonebook.jpeg)

Dictionaries are very common. They're mutable collections of key-value pairs, looked up by key rather than by position. 

Think of them like an actual dictionary. The key is the "word" and the value is the "definition".

In [102]:
state_capitals = { 'Washington': ['Olympia', 'Seattle'],
                 'Texas': ['Austin', 'Dallas'],
                 'Colorado': ['Denver', 'Boulder', 'Aspen']}

In [103]:
# Indexing -- grab the state "Texas"
state_capitals.get('Texas')

['Austin', 'Dallas']

In [104]:
# Square brackets work too, but only with a key. There is no "first" element by position:
# state_capitals[0] would raise a KeyError
state_capitals['Texas']

['Austin', 'Dallas']

Even so, dictionaries do keep their items in the order they were inserted. (Insertion order has been guaranteed since Python 3.7; earlier versions made no such promise.)

In [108]:
#delete a key-value pair
del state_capitals['Texas']

In [109]:
state_capitals

{'Washington': ['Olympia', 'Seattle'],
 'Colorado': ['Denver', 'Boulder', 'Aspen']}

In [114]:
#add a new key-value pair (lists can mix types, so a number can sit next to the city names)
state_capitals['Massachusetts'] = ['Boston', 'Salem', 1_000_000]

In [115]:
state_capitals

{'Washington': ['Olympia', 'Seattle'],
 'Colorado': ['Denver', 'Boulder', 'Aspen'],
 'Massachusetts': ['Boston', 'Salem', 1000000]}

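`my_dict.get(some_key)` is often the preferred way to access a value in a dictionary, because a missing key gives you `None` instead of the `KeyError` that `my_dict[some_key]` raises!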
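
The difference only shows up when a key is missing. The sketch below uses a trimmed copy of `state_capitals` and a key that is not in it (`'Oregon'`, chosen for illustration). The `try` / `except` block is only there to catch the error so the cell keeps running.

```python
state_capitals = {'Washington': ['Olympia', 'Seattle'],
                  'Colorado': ['Denver', 'Boulder', 'Aspen']}

# A key that exists behaves the same either way
found = state_capitals.get('Colorado')

# A missing key: .get() quietly returns None...
missing = state_capitals.get('Oregon')
# ...or a default value of your choosing
missing_with_default = state_capitals.get('Oregon', [])

# "in" checks whether a key exists
has_oregon = 'Oregon' in state_capitals

# Square brackets raise a KeyError for a missing key
try:
    state_capitals['Oregon']
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(found)                 # ['Denver', 'Boulder', 'Aspen']
print(missing)               # None
print(missing_with_default)  # []
print(has_oregon)            # False
print(raised_key_error)      # True
```

Which behavior you want depends on the situation: a silent `None` is convenient, but a loud `KeyError` can help you catch a typo in a key sooner.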

## Dictionaries are a big deal!

Dictionaries can get really big and really complicated, like the one below. 

This is a very efficient way to store complicated data that don't fit neatly in a spreadsheet. In fact, most web APIs return JSON, which maps naturally onto Python dictionaries and lists. We'll need to parse big dictionaries to get data from the internet!

Looking up a value by key in a dictionary is also generally faster than searching through a list for it.

In [117]:
authors = {
    "J.R.R. Tolkien": {
        "genre": "fantasy",
        "books": [
            "The Fellowship of the Ring",
            "The Two Towers",
            "The Return of the King"
        ],
        "active": False
    },
    "J.K. Rowling": {
        "genre": "fantasy",
        "books": [
            "The Sorcerer's Stone",
            "The Chamber of Secrets",
            "The Prisoner of Azkaban",
            "The Goblet of Fire",
            "The Order of the Phoenix",
            "The Half-Blood Prince",
            "The Deathly Hallows"
            
        ],
        "active": True,
        "phone": {
            "home": "(281) 330-8004",
            "work": "(800) HP0-TTER"
        }
    },
    "Suzanne Collins": {
        "genre": "science fiction",
        "books": ["The Hunger Games",
                 "Catching Fire",
                 "Mockingjay"],
        "phone": None,
        "active": True
    }
}

In [118]:
#check type of authors
type(authors)

dict

What `types` are in the dictionary?

In [119]:
#check type of J.R.R. Tolkien
type(authors.get("J.R.R. Tolkien"))

dict

In [120]:
#go one level deeper, check type of 'books'
type(authors.get("J.R.R. Tolkien").get("books"))

list

In [122]:
#go another level deeper: grab the second book in this author's list
authors["J.R.R. Tolkien"]["books"][1]

'The Two Towers'

In [123]:
# Slack: Code To get Suzanne Collins' 3rd book
authors["Suzanne Collins"]["books"][2]

'Mockingjay'

In [124]:
authors.get('Suzanne Collins').get('books')[2]

'Mockingjay'

In [125]:
authors.get("J.K. Rowling").items()

dict_items([('genre', 'fantasy'), ('books', ["The Sorcerer's Stone", 'The Chamber of Secrets', 'The Prisoner of Azkaban', 'The Goblet of Fire', 'The Order of the Phoenix', 'The Half-Blood Prince', 'The Deathly Hallows']), ('active', True), ('phone', {'home': '(281) 330-8004', 'work': '(800) HP0-TTER'})])

## `.items()` gives you keys and values
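
There are two sibling methods as well: `.keys()` for just the keys and `.values()` for just the values. The sketch below uses a trimmed-down, illustrative dictionary (`rowling`) rather than the full `authors` one, and wraps each result in `list()` so it prints plainly.

```python
rowling = {"genre": "fantasy", "active": True}

only_keys = list(rowling.keys())
only_values = list(rowling.values())
# Each item is a (key, value) tuple
pairs = list(rowling.items())

print(only_keys)    # ['genre', 'active']
print(only_values)  # ['fantasy', True]
print(pairs)        # [('genre', 'fantasy'), ('active', True)]
```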

## Booleans

![](assets/boole.jpg)

Booleans are variables that only have two different values: `True` and `False`. 

They're named after the mathematician **George Boole** and will come in real handy when we discuss control flow this afternoon.

You can do three operations with booleans : `not`, `and`, and `or`.

In [126]:
# set a variable to True
dummy_variable = True

In [127]:
dummy_variable

True

In [128]:
#check its type
type(dummy_variable)

bool

In [129]:
#set a variable to False
y = False

In [130]:
y

False

`not`: Simply gives the opposite

In [131]:
#set a variable to not True
z = not True

In [132]:
z

False

`and`: A and B only yields `True` if both A and B are true




In [133]:
#set three boolean values
sky_blue = True
grass_green = True
pigs_fly = False

In [134]:
#test the first two
sky_blue and grass_green

True

In [135]:
#test the next two
grass_green and pigs_fly

False

In [136]:
sky_blue or pigs_fly

True

`or`: `A or B` only yields `False` if both A and B are false

In [None]:
#test two with or
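
A possible fill-in for the cell above, combining all three operations on the same variables (the result names are illustrative):

```python
sky_blue = True
grass_green = True
pigs_fly = False

# or: False only when BOTH sides are False
one_true = sky_blue or pigs_fly        # True
both_false = pigs_fly or False         # False

# not flips a value, and parentheses group just like in math
flipped = not pigs_fly                       # True
not_both = not (sky_blue and pigs_fly)       # True

print(one_true, both_false, flipped, not_both)
# prints: True False True True
```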


## Cool story, Boole
So what? We rarely actually define variables to be `True` or `False`. More often, we get them from asking Python math problems.

In [137]:
# Greater than
5 > 3

True

In [138]:
# Less than
5 < 3

False

In [139]:
# Greater than or equal to
5 >= 5

True

In [140]:
# Break it into parts WITHIN parentheses
(3 < 4) and (4 > 5)

False

In [141]:
# Not equal to
3 != 4

True

In [144]:
# Equal to
5 = 5

SyntaxError: cannot assign to literal (<ipython-input-144-0c0eb488f2ba>, line 2)

A single equals sign is assignment, NOT a check of whether two values are the same. Use a double equals sign (`==`) to compare. 

In [145]:
5 == 5

True

Common error!
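
Side by side, the two look like this. The variable names are illustrative; the last line is a reminder that a string containing a digit is not equal to the number itself.

```python
x = 5                       # one equals sign: store 5 in x

is_five = x == 5            # two equals signs: ask a question, get a boolean
same_value = 5 == 5.0       # an int and a float with the same value are equal
text_vs_number = "5" == 5   # a string is never equal to a number

print(is_five, same_value, text_vs_number)
# prints: True True False
```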

## Summary

We covered:

- Basic Jupyter Notebook use
- Basic math in Python
- String manipulation in Python
- Collection data types in Python
- Booleans in Python

## Check for Understanding

- Make a list
- Get the second to last item from the list
- Make a dictionary
- Get a value from the dictionary
- Make a string variable
- Reverse the string