# Tutorial 1.3: Common Functions & Basic Data Types
Python for Data Analytics 
Module 1

Alright, it's time to actually start working with code! 

Previously we mentioned that Python is an object-oriented language. There are many such languages and something that most, if not all, of them have in common is the set of "built-in" basic types/classes of objects.

This is true for Python as well. In this tutorial we will briefly introduce most of Python's basic types that you will be dealing with throughout the rest of our course.

**Pythonista Tip:** Remember that objects are created from blueprints called *classes* or *types*. Beginners sometimes mix up which comes first (the class is the blueprint, and the object is what gets built from it) and forget that, in Python, *class* and *type* mean essentially the same thing.

## Common Functions
Before we look at the specifics of our basic data types, I'd like to introduce you to a number of functions that you will commonly use on all sorts of different objects. In order to demonstrate their use, I will need to create a simple object.

I'll create a `list` object, which you will read more about later in this tutorial.

In [2]:
# Here I'll create a simple list of three strings
cool_things = ['In-N-Out', 'Chipotle', 'Notre Dame']

In [3]:
# To get an object's type (or class), you pass it to the `type` function
type(cool_things)

list

In [4]:
# You can ask if a given object comes from a certain class/type
# with the `isinstance` function

# This asks, is the `cool_things` object a list?
isinstance(cool_things, list)

True

In [5]:
# Sequence and Container objects (objects that contain multiple elements)
# support the `len` function, which will tell you how many elements
# the sequence or container holds.

# Here we will use it to see how many strings are inside of our list
len(cool_things)

3
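Here is a small sketch that extends the three functions above a little. The sample values are made up for illustration. It relies on the fact that `isinstance` also accepts a tuple of types, which is handy when you want to ask "is this any kind of number?"

```python
# Hypothetical sample values, chosen only for illustration
restaurant = 'In-N-Out'
visits = 42

# `isinstance` accepts a tuple of types and returns True
# if the object matches any one of them
print(isinstance(visits, (int, float)))      # True
print(isinstance(restaurant, (int, float)))  # False

# `len` works on strings too, because a string is a sequence of characters
name_length = len(restaurant)
print(name_length)                           # 8
```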

In [6]:
# Passing an object to the `dir` function will return 
# a list of attributes (data) and methods (operations)
# that are available on that object.
dir(cool_things)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

See how many items in the output start and end with double underscores? These are Python's so-called "magic" (or "dunder") methods and attributes, which Python uses behind the scenes to implement built-in behavior such as `len()` and the `+` operator. We can ignore them for now, since you rarely need to call them directly.

**For our purposes in this course, just focus on the ones without underscores.**
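If you would rather not scan past the underscore names by eye, one possible approach is to filter them out. This is only a sketch: it uses a list comprehension and the `str.startswith()` method, neither of which has been covered yet, and the `public_names` label is my own.

```python
cool_things = ['In-N-Out', 'Chipotle', 'Notre Dame']

# Keep only the names that do not start with an underscore
public_names = [name for name in dir(cool_things) if not name.startswith('_')]
print(public_names)
```

The result should contain only the regular method names, such as `append`, `pop`, and `sort`.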

In [7]:
# We've already shown how you can use the `?` syntax (a feature of
# IPython/Jupyter, not of Python itself) to find out more about an
# object. As a reminder, you can execute this cell to see how it works.
cool_things?

In [8]:
# You can also get information on a given object using the `help`
# function. Using it on our `cool_things` list will tell you about
# all the methods available on a list object.
help(cool_things)

Help on list object:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

In [9]:
# Many objects will support the ability to display some sort
# of string representation via the `print` function.
print(cool_things)

['In-N-Out', 'Chipotle', 'Notre Dame']


In [10]:
# This one is somewhat esoteric for the beginner.
# Every object in memory has a unique identifier.
# You can get this identifier using the id() function.

# This allows you to see whether two different labels refer
# to the same actual object.

# Here I'll bind/assign another label to the `cool_things` list.
another_label = cool_things

# Now I'll compare their id() values.
id(another_label), id(cool_things)

# You'll see that both labels point to the same
# object in memory.

(140704346125448, 140704346125448)
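To see why this matters in practice, here is a minimal sketch. It uses the `is` operator, which compares identity (the same thing `id()` reports), and contrasts it with `==`, which compares contents. The `separate_copy` label and the `'Pizza'` value are mine, added for illustration.

```python
cool_things = ['In-N-Out', 'Chipotle', 'Notre Dame']
another_label = cool_things        # a second label for the same object
separate_copy = list(cool_things)  # a new list with the same contents

# `is` asks: do both labels refer to the very same object?
print(another_label is cool_things)  # True
print(separate_copy is cool_things)  # False

# `==` asks: do the two objects hold equal contents?
print(separate_copy == cool_things)  # True

# Because both labels refer to one object, a change made through
# one label is visible through the other, but not in the copy
another_label.append('Pizza')
print(cool_things)    # ['In-N-Out', 'Chipotle', 'Notre Dame', 'Pizza']
print(separate_copy)  # ['In-N-Out', 'Chipotle', 'Notre Dame']
```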

The `input()` function is where you get started with interactive programs. It will display a text field into which a user can type something.

You can assign the results of the function to a variable in order to make the value available to the rest of your program.

In [11]:
# Here I will capture whatever you enter into a variable
user_input = input()

hello


In [12]:
# Now I have access to this inside of the program
# and I'll print it back.
print(user_input)

hello
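One thing worth knowing is that `input()` always gives you back a `str`, even if the user types digits. The sketch below uses a hard-coded string as a stand-in for `input()` so that it runs without waiting for someone to type; the `raw_text` and `as_number` labels are just illustrative.

```python
# Stand-in for `input()` so the example runs on its own;
# in a notebook you would write: raw_text = input()
raw_text = '42'

# Whatever the user types arrives as a string
print(type(raw_text))   # <class 'str'>

# Convert it before doing arithmetic
as_number = int(raw_text)
print(as_number + 1)    # 43
```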


## Common Data Types/Classes
Alright, now that we've covered some commonly used functions, let's take a look at the specifics of some common data types.

### Strings
Objects that hold text data in Python have the  `str` type (short for string). Let's define a sample string object.

In [13]:
# Assign a `str` object to the `my_string` label.
my_string = 'Python is my friend'

In [14]:
# Strings are sequences of (Unicode) characters, so you can 
# use the `len` function to see how many characters are in them
len(my_string)

19

In [15]:
# You can print a string using the `print` function
print(my_string)

Python is my friend


In [16]:
# Using the `dir()` function on a `str` object, you'll
# see that they have MANY methods.
dir(my_string)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [17]:
# Here are a couple of examples of using the methods that we
# see are available for str objects.

# Uppercase
print(my_string.upper())

# Capitalize
print(my_string.capitalize())

PYTHON IS MY FRIEND
Python is my friend
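Here are a few more of the methods from the `dir()` output above, as a quick sketch. The replacement word `'tool'` and the `words` label are my own picks for illustration.

```python
my_string = 'Python is my friend'

# Split on whitespace into a list of words
words = my_string.split()
print(words)                                 # ['Python', 'is', 'my', 'friend']

# Replace one substring with another (this returns a new string)
print(my_string.replace('friend', 'tool'))   # Python is my tool

# Check how the string starts
print(my_string.startswith('Python'))        # True

# Join a list of strings back together with a separator
print('-'.join(words))                       # Python-is-my-friend

# Strings are immutable, so the original is unchanged
print(my_string)                             # Python is my friend
```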


In [18]:
# Try using some additional string methods in this cell.
# Create additional cells to try as many as you'd like.

### Integers & Floats
In general, numbers are stored in two basic types: 
* *`int`*: For integer (or whole) numbers.
* *`float`*: For fractional numbers.

In [19]:
# Define example `int` and `float` objects
my_int = 7811
my_float = 7811.5

In [20]:
# Display the types of each object
print(
  type(my_int), 
  type(my_float)
)

<class 'int'> <class 'float'>


In [21]:
# You can convert an `int` to `float` by using the `float()` constructor function.
float(my_int)

7811.0

In [22]:
# The opposite also works, but note that you lose
# the fractional part when converting a `float` to an `int`

# Important!! Python does not round here; it simply drops
# everything after the decimal point (truncating toward zero)
int(my_float)

7811
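To make the "no rounding" point concrete, here is a small comparison of `int()` with the built-in `round()` function and `math.floor()` from the standard library. The value `2.7` is an arbitrary example of mine.

```python
import math

value = 2.7

print(int(value))           # 2  (the fractional part is dropped)
print(round(value))         # 3  (rounded to the nearest whole number)

# For negative numbers, `int` truncates toward zero,
# while `math.floor` always moves down
print(int(-value))          # -2
print(math.floor(-value))   # -3
```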

### Lists
So far, what we've covered has been pretty easy. This one sounds intimidating at first, but really isn't, so stick with me.

<strong>The standard mutable heterogeneous ordered multi-element container in Python is the `list` object type.</strong>

<div class="alert alert-block alert-info">
<h5>A Multi-Mutable What?</h5>
That is quite a mouthful, isn't it? Let's take it one word at a time.
<ul>
<li>Container: A Python object that holds other objects.</li>
<li>Multi-Element: It can contain multiple (in practice unlimited) objects.</li>
<li>Ordered: Lists maintain the order of their elements.</li>
<li>Mutable: You can add/remove/change objects in the container. The opposite is an <em>immutable</em> container.</li>
<li>Heterogeneous: The elements of a list can be of different types.</li>
</ul>
</div> 

In [23]:
# There are a few different ways to create `list` objects.
# Here is the simplest way of doing so:
# You simply put a comma-separated list of elements inside of `[]` brackets
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
my_list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [24]:
# You can also create a list from any object that can be
# "iterated" (separated into pieces) by passing it to the
# `list` constructor function.

# Here is an example of turning a string into a list of
# individual characters
list_of_characters = list("Monty Python & The Holy Grail")
list_of_characters

['M',
 'o',
 'n',
 't',
 'y',
 ' ',
 'P',
 'y',
 't',
 'h',
 'o',
 'n',
 ' ',
 '&',
 ' ',
 'T',
 'h',
 'e',
 ' ',
 'H',
 'o',
 'l',
 'y',
 ' ',
 'G',
 'r',
 'a',
 'i',
 'l']

In [25]:
# You can access the individual elements of a list by 
# their "index value". Index values in Python start at 0 
# (not 1), so the first element has a 0 index value.

# This retrieves the first element in the list
my_list[0]

0

You can also pass negative index values to retrieve list elements. Using negative values retrieves elements based on how far away they are from the end of the list.

In [26]:
# Pull the last element of the list
my_list[-1]

9

In [27]:
# Pull the second to last element of the list
my_list[-2]

8

You can retrieve multiple elements from a list using **slice notation**.

The basic form of slice notation looks like this: `list[start_index:finish_index]`. When used, all elements with indexes from the `start_index` value up to (but not including) the `finish_index` value will be returned.

Let's take a look at an example:

In [28]:
my_list[0:5]

[0, 1, 2, 3, 4]

In the example above, we specified two index values separated by a colon (`:`). What this meant to Python was: *give me all elements of `my_list` whose index values are from 0 through 4*.

**Important**: Remember that slices do not include the element at `finish_index`.


There is an extended form of slice notation which looks like this: `list[start_index:finish_index:interval]`. You can use this form to change the interval at which you pull elements out of the original list.

In [29]:
# An interval value of 2 would take every other element
my_list[0:5:2]

[0, 2, 4]

In [30]:
# While an interval value of 3 would take every third element
# and so on...
my_list[0:10:3]

[0, 3, 6, 9]

**Pythonista Tip:** You can use slice notation on `str` objects as well, because a string is a sequence of individual characters.
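Here is a short sketch of a few more slice forms, on both a list and a string. It relies on the fact that you may leave out the start, finish, or both, in which case Python uses the beginning or end of the sequence. The `title` label is my own.

```python
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(my_list[:5])     # [0, 1, 2, 3, 4]  (start omitted: begin at index 0)
print(my_list[5:])     # [5, 6, 7, 8, 9]  (finish omitted: run to the end)
print(my_list[-3:])    # [7, 8, 9]        (the last three elements)
print(my_list[::2])    # [0, 2, 4, 6, 8]  (every other element)
print(my_list[::-1])   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]  (negative interval reverses)

# The same notation works on strings
title = 'Monty Python & The Holy Grail'
print(title[0:5])      # Monty
print(title[6:12])     # Python
print(title[-5:])      # Grail
```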


In [31]:
# Lists can contain different types of data
heterogeneous_list = [1, 'Spiderman: Homecoming', 2.1, 'Rogue One']
heterogeneous_list

[1, 'Spiderman: Homecoming', 2.1, 'Rogue One']

In [32]:
# You can add elements to a list with the `list.append()` method
my_list.append(10)
my_list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [33]:
# You can remove elements from a list with the `list.pop()` method

# Remove the last element from a list
my_list.pop()
print(my_list)

# You can pass an index value to remove that specific element
my_list.pop(0)
print(my_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
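The cell above ignores it, but `pop()` also hands back the element it removed, which you can capture in a variable. A minimal sketch, using a shorter list and label names of my own:

```python
short_list = [0, 1, 2, 3]

# `pop()` returns the element it removes
last_item = short_list.pop()
print(last_item)    # 3
print(short_list)   # [0, 1, 2]

# The same is true when you pass an index
first_item = short_list.pop(0)
print(first_item)   # 0
print(short_list)   # [1, 2]
```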


In [34]:
# You can also use the `remove()` method
# Which will remove the first occurrence of a value
my_list.remove(3)
my_list

[1, 2, 4, 5, 6, 7, 8, 9]

In [35]:
# You can change (mutate) the value at a given list index.
# Here we will change the first element to 123
# You simply add the assignment operator `=` and a new value
# to your element selection
my_list[0] = 123
my_list

[123, 2, 4, 5, 6, 7, 8, 9]

In [36]:
# You can sort a list
my_list.sort()
my_list

[2, 4, 5, 6, 7, 8, 9, 123]
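A common point of confusion is that `sort()` changes the list in place and returns `None`. If you want a sorted copy and need the original left alone, the built-in `sorted()` function is one option. A quick sketch with a made-up `scores` list:

```python
scores = [3, 1, 2]

# `sorted` returns a new sorted list and leaves the original alone
ordered = sorted(scores)
print(ordered)   # [1, 2, 3]
print(scores)    # [3, 1, 2]

# `.sort()` sorts in place and returns None
result = scores.sort()
print(result)    # None
print(scores)    # [1, 2, 3]

# Pass reverse=True to sort from largest to smallest
scores.sort(reverse=True)
print(scores)    # [3, 2, 1]
```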

In [37]:
# Or reverse it
my_list.reverse()
my_list

[123, 9, 8, 7, 6, 5, 4, 2]

In [38]:
# You can ask Python if a given value exists
# in a list with the `in` operator

123 in my_list

True

### Sets
A `set` object has many of the same characteristics as a `list` object. 

The most essential differences between them are: 
* A `set` will only allow a given value to exist once inside of it.
* `set` objects do not have any sense of guaranteed order.
* `set` objects have special methods to quickly compare the items they contain with other sets or lists.

In [39]:
# Just like with a list, you can create a set object
# using short-hand syntax, this time with curly braces
# instead of square ones.

# Notice that in the input, I have included the value 3 twice
# but the second instance of 3 will be dropped 
# as the set is constructed
my_set = {1, 2, 3, 3, 4, 5}
my_set

{1, 2, 3, 4, 5}

In [40]:
# You can also create `set` objects with the `set()` 
# constructor function that consumes some sort of iterable
# just like the `list()` function

# Here we will use it to generate a set from 
# the same string used above but note how
# duplicate letters are dropped and the 
# order is all out of whack
set("Monty Python & The Holy Grail")

{' ',
 '&',
 'G',
 'H',
 'M',
 'P',
 'T',
 'a',
 'e',
 'h',
 'i',
 'l',
 'n',
 'o',
 'r',
 't',
 'y'}

In [41]:
# Sets are awesome when you want to 
# compare/combine/subtract one group of items
# from another. 

# To demonstrate, let's assume that we have 
# the following two sets of integers

set_one = {1, 2, 3, 5, 8}
set_two = {1, 3, 5, 7}

In [42]:
# You can combine two sets and get a new set of the unique
# elements from both sets with the `union()` method
set_one.union(set_two)

{1, 2, 3, 5, 7, 8}

In [43]:
# You can use the `intersection()` method to see which 
# elements belong to both sets
set_one.intersection(set_two)

{1, 3, 5}

In [44]:
# The `difference()` method allows you to identify
# elements that exist in the first set, but not the other
set_one.difference(set_two)

{2, 8}
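The three methods above also have operator shorthands, sketched below. I wrap each result in the built-in `sorted()` function only so the printed order is predictable, since sets themselves make no ordering promise.

```python
set_one = {1, 2, 3, 5, 8}
set_two = {1, 3, 5, 7}

# `|` is shorthand for union, `&` for intersection, `-` for difference
print(sorted(set_one | set_two))   # [1, 2, 3, 5, 7, 8]
print(sorted(set_one & set_two))   # [1, 3, 5]
print(sorted(set_one - set_two))   # [2, 8]

# Difference depends on which set comes first
print(sorted(set_two - set_one))   # [7]

# `^` (symmetric difference) keeps elements found in exactly one of the sets
print(sorted(set_one ^ set_two))   # [2, 7, 8]
```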

### Dictionaries
A Python *`dict`* object is a mapping between keys and values. Like a *`list`*, *`dict`* objects are also containers, but you access the specific elements with their 'key' rather than an index value.

<div class="alert alert-block alert-info">
<h4>Python's Evolving</h4>
<p>Historically, dictionaries did not maintain the order in which elements were added. Starting with Python 3.7, however, preserving insertion order is a guaranteed part of the language.</p>
<p>
    In Python 3.6 (the version this class was originally written for), dictionary element ordering is also maintained, but it was considered an implementation detail rather than a guaranteed feature.
</p>
</div> 

As with a `list` object, there are a couple of different ways to create `dict` objects.

In [45]:
# You can create an empty dict with the constructor function
# and then add elements (see below)
example_dict = dict()
example_dict

{}

In [46]:
# You can pass key/value pairs to the constructor function
example_dict = dict(name='Donald', species='Duck')
example_dict

{'name': 'Donald', 'species': 'Duck'}

In [47]:
# You can also use the curly braces syntax
# to quickly specify the key/value pairs.

# IMPORTANT: This looks similar to how you 
# constructed `set` objects in the previous
# section. Notice the key is separated from the 
# value by a colon in the `dict` form.
university_info = {
    'name': 'University of Notre Dame',
    'mascot': 'Fighting Irish',
    'city': 'Notre Dame',
    'state': 'Indiana'
}

**Pythonista Tip:** You can use both strings and numbers as keys in a dictionary (more generally, any hashable object will work).


In [48]:
# You access dictionary items with their key
university_info['name']

'University of Notre Dame'
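Asking for a key that does not exist with square brackets raises a `KeyError`. Here is a sketch of two gentler ways to check, using the `in` operator and the `dict.get()` method. The `'founded'` key is hypothetical and deliberately missing from the dictionary.

```python
university_info = {
    'name': 'University of Notre Dame',
    'mascot': 'Fighting Irish',
}

# `in` checks whether a key exists (it looks at keys, not values)
print('mascot' in university_info)                 # True
print('founded' in university_info)                # False

# `get` returns None for a missing key instead of raising a KeyError
print(university_info.get('founded'))              # None

# You can also supply your own fallback value
print(university_info.get('founded', 'unknown'))   # unknown
```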

In [49]:
# Dictionaries are also mutable.
# You can change the values associated with a given key
university_info['state'] = 'California'
university_info

{'name': 'University of Notre Dame',
 'mascot': 'Fighting Irish',
 'city': 'Notre Dame',
 'state': 'California'}

In [50]:
# You can add new elements to a dictionary by simply
# assigning a new value to a new key
university_info['country'] = 'United States'
university_info

{'name': 'University of Notre Dame',
 'mascot': 'Fighting Irish',
 'city': 'Notre Dame',
 'state': 'California',
 'country': 'United States'}
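Notice that the new key landed at the end. The sketch below shows the same insertion-order behavior on a small dictionary of its own, assuming you are running Python 3.7 or later, where that order is guaranteed. The `info` label is just for illustration.

```python
info = {}
info['name'] = 'University of Notre Dame'
info['state'] = 'Indiana'
info['country'] = 'United States'

# Passing a dict to `list` gives its keys, in the order they were added
print(list(info))   # ['name', 'state', 'country']

# Changing an existing value does not move the key
info['name'] = 'Notre Dame'
print(list(info))   # ['name', 'state', 'country']
```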

In [51]:
# Dictionaries also have the `pop()` method to remove items
# but you have to provide the key (not the value) you want to remove
university_info.pop('country')
university_info

{'name': 'University of Notre Dame',
 'mascot': 'Fighting Irish',
 'city': 'Notre Dame',
 'state': 'California'}

In [52]:
# You can retrieve just the keys or values of a dict

# keys only
print(university_info.keys())

# values only
print(university_info.values())

dict_keys(['name', 'mascot', 'city', 'state'])
dict_values(['University of Notre Dame', 'Fighting Irish', 'Notre Dame', 'California'])
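Alongside `keys()` and `values()`, there is also an `items()` method that gives you each key paired with its value. A short sketch on a smaller dictionary follows; wrapping the result in `list()` is my choice, made just for easier reading.

```python
university_info = {
    'name': 'University of Notre Dame',
    'mascot': 'Fighting Irish',
}

# `items()` pairs each key with its value
pairs = list(university_info.items())
print(pairs)
# [('name', 'University of Notre Dame'), ('mascot', 'Fighting Irish')]

# `len` tells you how many key/value pairs the dict holds
print(len(university_info))   # 2
```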


In [53]:
# And you can update one dictionary with
# the values in another
dict_1 = {'name': 'Michael Dunn'}
dict_2 = {'age': '999'}

dict_1.update(dict_2)
dict_1

{'name': 'Michael Dunn', 'age': '999'}

### Booleans
A *`bool`* object has two possible values in Python: *`True`* or *`False`*. This type of object is typically produced by comparison expressions, such as the condition tested in an *`if`* statement (where you are checking whether or not something is true).

In [54]:
# Here I will assign the result of the "greater than"
# comparison operator to the variable `my_boolean`
my_boolean = 5 > 4
my_boolean

True

In [55]:
# And correspondingly, if I come up with an untrue statement,
# the value of my `bool` object will become `False`
my_boolean = 4 > 5
my_boolean

False

In [56]:
# As always, you can check the object type
type(my_boolean)

bool

In [57]:
# 'True' boolean values are considered equal to 1
True == 1

True

In [58]:
# 'False' boolean values are considered equal to 0
False == 0

True
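This equivalence is more useful than it might look. Because `True` counts as 1 and `False` as 0, you can add booleans up to count how many are true. Here is a minimal sketch with a made-up `answers` list:

```python
# Hypothetical list of yes/no results
answers = [True, False, True, True]

# Summing booleans counts the True values
true_count = sum(answers)
print(true_count)              # 3

# This works because `bool` is a special kind of `int`
print(isinstance(True, int))   # True
print(True + True)             # 2
```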

### `None` Object
The special object *`None`* represents the lack of a value. This is a bit esoteric without context, but just remember that this object exists. We will see it used practically as we move through our course.
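To give a little of that context now, here is a sketch of a few places where `None` shows up, using only tools from this tutorial plus the `is` operator for the comparison. The labels and the `'species'` key are illustrative.

```python
# You can assign None to a label to mean "no value yet"
result = None
print(result is None)   # True

# `dict.get` returns None when a key is missing
info = {'name': 'Donald'}
species = info.get('species')
print(species)          # None

# Methods that change an object in place, like `list.sort`,
# typically return None
sort_result = [3, 1, 2].sort()
print(sort_result)      # None

# None has its own type
print(type(None))       # <class 'NoneType'>
```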