# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Intro to Python: Datatypes

### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Explain the origin and philosophy of Python
- Explain why it is the preferred choice of Data Scientists
- Define integers, strings, tuples, lists, and dictionaries
- Demonstrate arithmetic operations and string operations
- Demonstrate variable assignment

## What is Python?

>Python is a general-purpose programming language created by Guido van Rossum, aka "Benevolent Dictator for Life", in the early 1990s.

![](https://www.python.org/static/img/python-logo@2x.png)

<center>![](https://www.python.org/~guido/images/IMG_2192.jpg)</center>

>"[I]n December 1989, I was looking for a "hobby" programming project that would keep me occupied during the week around Christmas. My office ... would be closed, but I had a home computer, and not much else on my hands. I decided to write an interpreter for the new scripting language I had been thinking about lately"

## Why "Python"?

> "I chose Python as a working title for the project, being in a slightly irreverent mood (and a big fan of Monty Python's Flying Circus)."

## Philosophy

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Python is characterized as:

**High-level:**
- No need to dive into memory allocation, garbage collection, etc. Everything is an object.<br>

**Interpreted:**
- Executed by an interpreter at runtime; there is no separate compile-and-link step before running

**Dynamic:**
- No need to declare each variable's type, e.g., int, float, string

**Strongly Typed:**
- Will **not** automatically convert a value's type, e.g., "1" + 2 raises a TypeError instead of producing "12"

## Interpreted

In [1]:
print 'hello'
print 'world'

hello
world


## Dynamic

In [4]:
x = 1
y = 2

print(x + y)

3
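The cell above never states a type for x or y. A minimal sketch of what "dynamic" means in practice: the same name can be rebound to values of different types. The names `value`, `first_type`, `second_type`, and `third_type` are made up for this illustration, and the code is written so it should behave the same under Python 2.7 (used in this lesson) and Python 3.

```python
# A name is only a label; it can be rebound to objects of different types.
value = 1
first_type = type(value).__name__     # 'int'

value = 'one'
second_type = type(value).__name__    # 'str'

value = 1.0
third_type = type(value).__name__     # 'float'

print(first_type)
print(second_type)
print(third_type)
```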


## Strongly Typed

In [5]:
%%javascript
alert("1" + 2)

<IPython.core.display.Javascript object>

In [6]:
"1" + 2

TypeError: cannot concatenate 'str' and 'int' objects

In [7]:
"1" + str(2)

'12'
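A small sketch of the same idea without the notebook traceback: the mixed addition is caught with try/except, and the conversion is then made explicit in each direction. The variable names (`text`, `number`, `result`, `message`, `as_string`, `as_number`) are illustrative assumptions, and the exact wording of the error message differs between Python versions.

```python
text = "1"
number = 2

# Python refuses to guess whether we want string or numeric addition.
try:
    result = text + number
except TypeError as err:
    result = None
    message = str(err)  # wording depends on the Python version

# We have to say which one we mean.
as_string = text + str(number)   # string concatenation
as_number = int(text) + number   # numeric addition

print(as_string)
print(as_number)
```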

## Why Use Python for Data Science?

Wes McKinney, author of the pandas library, excerpted from Quora:

<small>
"Python's popularity for data science is largely due to the <b>strength of its core libraries (NumPy, SciPy, pandas, matplotlib, IPython)</b>, <b>high productivity for prototyping and building small and reusable systems</b>, and its <b>strength as a general purpose programming language</b>. Since data scientists are also often involved with wiring together network applications, programming for the web, scripting and automating data processing jobs and other processes, and lots of ad hoc data munging (the kind of stuff people loved using Perl for in the 90s), it's very desirable to be able to do all these things, in addition to the actual analysis and modeling, in a single language. Python also excels as a glue language for applications written in C, C++, and Fortran, especially using the excellent Cython [5] project. <b>Typically only a small part of an application you build will be slow</b> (call it a 90/10 or 95/5 rule); by using Python you can build a program very quickly, profile and identify bottlenecks, then optimize by using better array programming techniques or, <b>if necessary, reimplementing the bottlenecks in Cython or C/Fortran.</b>"
</small>

## Which version will we use?
>- We will be using the latest version in the 2.X line, which is 2.7.11

## Why not version 3.X?
>- As of mid-2016, there is a roughly even split between 2.X and 3.X usage
> within the industry. Because not all libraries have yet been ported to 3.X,
> we chose to stick with 2.X. Overall, however, there is very little practical
> difference.

## What is this magical notebook you're using?
>- It's called the Jupyter notebook (formerly the IPython notebook)
and it is magical. We'll learn more about it throughout the course. For now,
you just need to know Shift+Enter runs a cell's code and Esc+A or Esc+B adds
a cell above or below, respectively, the cell you are working in.

## Seriously though, tabs or spaces?

<center><img src="http://cimg.tvgcdn.net/i/r/2016/03/18/242d2375-8fb9-4772-b46a-b1f0ff4a2a07/thumbnail/1300x867/afc4dfbc3d5ab10fde649b2d193c7ab5/160318-news-silicon-valley.jpg" width=400 height=150></center>

## PEP-8 has all the answers to style questions...

> https://www.python.org/dev/peps/pep-0008/#tabs-or-spaces <br>

> "Spaces are the preferred indentation method."<br>
> 4 to be exact

# Python Data Types

>Python has several "built-in" types:
- **numerics**
- **sequences**
- **mappings**
- files
- classes
- instances
- exceptions

## Numerics

>- **int**
>- **float**
>- long
>- complex

### 1) Integers - whole numbers, either positive or negative

In [8]:
4

4

### We can do math!

In [9]:
4+6

10

### We can assign ints to variables!

In [10]:
x = 786

In [11]:
y = 711

In [12]:
x + y

1497

### 2) Floats - numbers with a decimal point

In [13]:
3.2

3.2
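A brief sketch of how ints and floats interact. One thing worth knowing is that plain `/` between two ints behaves differently across versions: in Python 2.7, `7 / 2` gives 3, while in Python 3 it gives 3.5. The example below sticks to forms that should give the same answer in both. All variable names are illustrative.

```python
whole = 7 // 2        # floor division: the result is an int
exact = 7 / 2.0       # one float operand makes the result a float
mixed = 4 + 3.2       # int + float gives a float

as_float = float(4)   # explicit int -> float conversion
as_int = int(3.2)     # explicit float -> int conversion drops the fraction

print(whole)
print(exact)
print(type(mixed).__name__)
print(as_float)
print(as_int)
```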

## Sequences

>- **strings**
>- **tuples**
>- **lists**
>- unicodes
>- bytearrays
>- buffers

### Sequences: Strings

In [14]:
"string here"

'string here'

### Also assignable to variables

In [15]:
x = "string here"

In [16]:
x

'string here'

In [17]:
y = 'string here too'

In [18]:
y

'string here too'

### Let's talk about string operations

### concatenation (+)

In [19]:
x + y

'string herestring here too'

In [20]:
x + ' ' + y

'string here string here too'

In [21]:
print x + y

string herestring here too


In [22]:
print x,y

string here string here too


### String Multiplication (*)

In [23]:
y = 5

In [24]:
x * y

'string herestring herestring herestring herestring here'

## Exercise: We've seen that we can use '+' with two strings and '*' with a string and a number. Now take a moment and evaluate what happens with each of the following:

- try '/' with two strings
- try '/' with a string and a number
- try '+' with a string and a number
- try '**' with a string and a number
- try '+' with two strings directly followed by '*' and a number

In [25]:
x / y

TypeError: unsupported operand type(s) for /: 'str' and 'int'

In [28]:
x ** 1


TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
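For the last item in the exercise, a rough sketch of how '+' and '*' combine. The usual arithmetic precedence applies to strings too, so '*' is evaluated before '+' unless parentheses say otherwise. The strings mirror the x and y used earlier in this lesson; `combined`, `grouped`, and `plus_error` are names invented for this example.

```python
x = "string here"
y = "string here too"

combined = x + y * 2     # evaluated as x + (y * 2)
grouped = (x + y) * 2    # parentheses force the concatenation first

# '+' with a string and a number is rejected rather than converted.
try:
    x + 5
    plus_error = None
except TypeError as err:
    plus_error = type(err).__name__

print(combined)
print(grouped)
print(plus_error)
```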

### String indexing

In [29]:
x

'string here'

### How do we access that first 's' in our string?

In [30]:
x[0]

's'

### What if we want more than just a single character?

In [31]:
x[0:2]

'st'

In [32]:
x[:2]

'st'

### Notice that the end index of a slice is 'exclusive' - x[0:2] returns the 0th and the 1st element, but not the 2nd

## What if we want to go backwards?

## Can we index from the right side?

## Yes, using the negative sign

In [33]:
x[-1]

'e'

### How would we get the last three characters using indexing from the right?

In [34]:
x[-3:]

'ere'

### Exercise: We've seen we can assign strings to variables and we can select pieces of a string with indexing and slicing. Take a moment and see if we can index a piece of our string and replace it with some other characters. <br>
<br>
- Try replacing the last four characters, 'here', with 'there' - does it work?

In [39]:
x[-4:] 

'here'
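A minimal sketch of one way to approach this exercise. Strings are immutable, so assigning to a slice raises a TypeError; the usual workaround is to build a new string from the pieces you want to keep. The names `assigned` and `replaced` are illustrative.

```python
x = "string here"

# Attempting to change part of a string in place fails.
try:
    x[-4:] = "there"
    assigned = True
except TypeError:
    assigned = False

# Build a new string instead: everything before the last four
# characters, plus the replacement text.
replaced = x[:-4] + "there"

print(assigned)
print(replaced)
print(x)  # the original string is unchanged
```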

### Sequences: Tuples - ( )

### Like strings, tuples are immutable sequences, but unlike strings, they can hold more than just characters. They are denoted with parentheses.

In [37]:
ext = ('Hello', 1, x)

In [38]:
ext

('Hello', 1, 'string here')

### How can we index that?

In [42]:
ext[0:2]

('Hello', 1)

## How do we slice it?

### Can we use concatenation?

In [43]:
ext + ('2','Monkey')

('Hello', 1, 'string here', '2', 'Monkey')

### Can we replace one of the items with another?
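A short sketch answering the question above, assuming the same `ext` tuple as in the earlier cells. Item assignment on a tuple raises a TypeError, but slicing and concatenation can produce a new tuple with the change applied. `replaced_in_place` and `new_ext` are names made up for this example.

```python
x = "string here"
ext = ('Hello', 1, x)

# Tuples do not support item assignment.
try:
    ext[0] = 'Goodbye'
    replaced_in_place = True
except TypeError:
    replaced_in_place = False

# Build a new tuple instead. Note the trailing comma: ('Goodbye',)
# is a one-item tuple, while ('Goodbye') is just a string.
new_ext = ('Goodbye',) + ext[1:]

print(replaced_in_place)
print(new_ext)
print(ext)  # the original tuple is unchanged
```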

### Sequences: Lists - [ ]

### A list is a mutable sequence, i.e., the items in it can be replaced. It is denoted with square brackets.

In [44]:
elst = ['Cat', 2, 'Horse', x]

In [45]:
elst

['Cat', 2, 'Horse', 'string here']

### Let's swap out the first item in our list

In [49]:
elst[0] = 'bird'

In [50]:
elst

['bird', 2, 'Horse', 'string here']

### How do we index 'Horse' in our list?

### How would we concat?

### Can we do this?

In [51]:
elst + 5

TypeError: can only concatenate list (not "int") to list

### What about this?

In [52]:
elst + [5]

['bird', 2, 'Horse', 'string here', 5]

### What about the '*' operator?

In [53]:
elst * 3

['bird',
 2,
 'Horse',
 'string here',
 'bird',
 2,
 'Horse',
 'string here',
 'bird',
 2,
 'Horse',
 'string here']
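Assigning to an index and assigning to a slice behave differently, which is easy to trip over when replacing the first item of a list. The sketch below contrasts the two on small throwaway lists; the names `by_index`, `by_slice`, `by_slice_list`, and `horse_position` are hypothetical. Slice assignment splices in each element of the right-hand side, and a string is a sequence of characters.

```python
# Index assignment replaces exactly one item.
by_index = ['Cat', 2, 'Horse']
by_index[0] = 'bird'

# Slice assignment splices in every element of the right-hand side,
# so a string is spread out character by character.
by_slice = ['Cat', 2, 'Horse']
by_slice[:1] = 'bird'

# Wrapping the value in a list keeps it as a single item.
by_slice_list = ['Cat', 2, 'Horse']
by_slice_list[:1] = ['bird']

# Position of an item, which can then be used as an index.
horse_position = by_index.index('Horse')

print(by_index)
print(by_slice)
print(by_slice_list)
print(by_index[horse_position])
```

Running a slice assignment like `by_slice[:1] = 'bird'` a second time would spread the characters out again, so re-executing such a cell in a notebook changes the list each time.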

## Mappings

>- **dictionaries (dicts)**

### Mappings: Dicts - { } 

### Dictionaries are also known as key-value stores. Like lists, they are mutable in that the values for a given key can be replaced. They are denoted with curly braces.

In [54]:
exd = {'key_a': 1, 'key_b': 'dog', 'key_c': [0,1]}

In [55]:
exd

{'key_a': 1, 'key_b': 'dog', 'key_c': [0, 1]}

### How can we retrieve an item by its key?

In [56]:
exd['key_a']

1

### But what if we reference a key that doesn't exist?

In [57]:
exd['key_d']

KeyError: 'key_d'

### Can we check for a key without throwing an error?

In [58]:
exd.get('key_d')
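The cell above shows no output because `get` returns None for a missing key, and the notebook does not echo None. A small sketch that makes this visible, using the same dictionary as above; `missing`, `fallback`, and `present` are illustrative names. `get` also accepts an optional second argument to return in place of None.

```python
exd = {'key_a': 1, 'key_b': 'dog', 'key_c': [0, 1]}

missing = exd.get('key_d')              # key absent: returns None
fallback = exd.get('key_d', 'n/a')      # key absent: returns the default
present = exd.get('key_a', 'n/a')       # key present: default is ignored

print(missing)
print(fallback)
print(present)
```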

### Another way to check

In [59]:
'key_a' in exd

True

In [60]:
'key_d' in exd

False

### We can add new key/value pairs

In [61]:
exd['key_d'] = 2.1

In [62]:
exd

{'key_a': 1, 'key_b': 'dog', 'key_c': [0, 1], 'key_d': 2.1}

### Exercise:

### What can be a key? We know a string can be. Take a moment and check to see if the following can be dict keys:

- Numerics (int, float)
- Tuples
- Lists

Why or why not?
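One way to check your answers to this exercise is to try each candidate as a key and record whether Python accepts it. This is only a sketch: `keys_to_try` and `results` are names invented here. Dictionary keys must be hashable, which in practice rules out mutable built-ins such as lists.

```python
keys_to_try = [1, 2.5, ('a', 'b'), ['a', 'b']]
results = {}

for key in keys_to_try:
    type_name = type(key).__name__
    try:
        {key: 'value'}             # building the dict hashes the key
        results[type_name] = True
    except TypeError:              # raised for unhashable types
        results[type_name] = False

for type_name in ['int', 'float', 'tuple', 'list']:
    print(type_name + ': ' + str(results[type_name]))
```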

## Exercise:
Pair up with a neighbor.
You're going to make a checklist for all the types we've covered: Strings, Tuples, Lists, Dictionaries
<br><br>Report whether they are:
- mutable
- concat-able (+)
- multiplicable

## Useful functions for data types

In [63]:
type(4)

int

In [64]:
type([0,1,2])

list

In [65]:
isinstance(4, int)

True

In [66]:
isinstance(3.2, float)

True

In [67]:
import sys

In [68]:
sys.getsizeof((0,0))

72

In [69]:
sys.getsizeof([0,0])

88
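The byte counts reported by `sys.getsizeof` depend on the Python version and platform, so the numbers above may not match what you see. A hedged sketch that compares sizes instead of relying on exact values, and assumes the standard CPython interpreter; the variable names are illustrative. It also shows that `isinstance` accepts a tuple of types.

```python
import sys

tuple_size = sys.getsizeof((0, 0))
list_size = sys.getsizeof([0, 0])

# In CPython, a list carries extra bookkeeping so that it can grow,
# which is why it reports a larger size than a tuple with the same items.
tuple_is_smaller = tuple_size < list_size

# isinstance can check against several types at once.
is_number = isinstance(4, (int, float))
text_is_number = isinstance('4', (int, float))

print(tuple_is_smaller)
print(is_number)
print(text_is_number)
```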