In [2]:
%%html

<img src="https://avatars.githubusercontent.com/u/73504156?s=100&v=4" style="float: left; margin-right: 15px; width: 75px;">
<h1 style="margin-bottom: -1px; margin-top: 5px;">02 Intro to Python: Data Types</h1>
    <div>Data Science Fundamentals</div>
        <div style="float: left; margin-right: 5px;">
            David Yerrington
        </div>
    </div>
    <a href="http://www.github.com/dyerrington"><img src="https://snipboard.io/mFdILJ.jpg" style="width: 20px; display: inline-block;" /></a>
    <a href="http://www.linkedin.com/in/davidyerrington"><img src="https://snipboard.io/UZ3azp.jpg" style="width: 20px;"></a>
    <a href="https://discord.gg/aqCp7DVhWn"><img src="https://snipboard.io/NHlrIG.jpg" style="width: 20px;"></a>

</div>

## Objectives

Upon finishing this notebook, you will be able to:

- Tell your friends how Python types work
- Identify the main Python types
- Explain the limitations and tradeoffs between Python types
- Choose Python types based on specific problems

## Prerequisites
- Set up [Jupyter Lab](https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html) + [Python environment](https://www.anaconda.com/products/individual) (Anaconda is strongly recommended as your go-to Python environment for data science.)
- Run Jupyter from the [command line interface](https://jupyterlab.readthedocs.io/en/stable/getting_started/starting.html) or using [Anaconda Navigator](https://docs.anaconda.com/anaconda/navigator/index.html)

## Intro

Types are an essential part of any programming language. They define what values can be stored in variables, and how those values can interact with each other. In this module we'll look at the basic type system for Python, which shares many ideas with other modern languages. We'll also see some examples of how Python's type system works.

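This module groups Python's basic data types into three families: numeric, sequential, and logical. These correspond roughly to integers and floats (numeric), strings and lists (sequential), and booleans (`True` or `False`). Dictionaries and sets are also built-in collections, although strictly speaking they are not sequences because their items are not accessed by position. Also, some of these types are immutable, meaning they cannot be changed once created, which can be a good thing depending on the intended use case.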
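
To make these families a little more concrete, here is a small sketch that asks Python for the type of a few values and then contrasts a mutable list with an immutable string. The variable names (`numbers`, `greeting`, `string_changed`) are made up for this illustration.

```python
# Inspect the type of a few literal values with the built-in type() function
print(type(42))         # numeric: int
print(type(3.14))       # numeric: float
print(type("hello"))    # sequential: str
print(type([1, 2, 3]))  # sequential: list
print(type(True))       # logical: bool

# Lists are mutable: an element can be replaced in place
numbers = [1, 2, 3]
numbers[0] = 99

# Strings are immutable: assigning to an index raises a TypeError
greeting = "hello"
try:
    greeting[0] = "j"
    string_changed = True
except TypeError:
    string_changed = False
```

After running this, `numbers` holds the new first element while `greeting` is untouched, which is the practical difference between mutable and immutable types.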

A good analogy for Python types is comparing a building with a car. Both can have people in them, but only one of them can go from point A to point B while the other is just a box with a roof. You wouldn't want your car to be able to do all the things your house does, right?  Regardless, it's important to understand what each type does so you know how best to use them.

![image.png](attachment:615236b1-b2e6-4efa-af78-2476687f56bf.png)
> I may not be able to work this into the analogy but it is a pretty cool example of unnecessary utility.


## Numeric Data Types
Let's start with numeric data types because they're the easiest to explain. Numeric data types are the base unit of information in any programming language. They are used for everything from storing numbers such as currency amounts, stock prices, and scientific measurements, to more abstract concepts like timestamps and matrices.

### Integer and Float

There are many different kinds of numerical data types, but we're going to focus on the most common ones. There are two main ways to represent numbers in Python: integer and float. An integer is a whole number.  For example, -100, -5, 12, 12345, and 1000000000 are all integers. Floating point numbers, on the other hand, are written with a decimal point and can hold a fractional part.  So -3.4 is a float whereas -3 is an integer.
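
Here is a quick sketch of the difference, using the same -3 and -3.4 values from the paragraph above. The variable names are just for illustration.

```python
whole = -3          # an integer: no fractional part
fractional = -3.4   # a float: written with a decimal point

print(type(whole))       # <class 'int'>
print(type(fractional))  # <class 'float'>

# Mixing an int and a float in arithmetic produces a float
mixed = whole + fractional
print(type(mixed))       # <class 'float'>

# Converting between the two types
truncated = int(fractional)  # int() drops the fractional part (truncates toward zero)
as_float = float(12)         # float() turns the whole number 12 into 12.0
```

Note that `int()` does not round; it simply discards everything after the decimal point.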

![image.png](attachment:9b36ded4-be2a-45bb-98f9-9c181704d187.png)

> #### Did you know? (optional!)
>
> Sometimes I'll add these extra bits in notebooks that don't fit with the main "flow" of the topic module being presented.  These are opportunities to dig further on your own if you're curious, or completely skip.
>
> _A float is typically used when doing calculations involving very large or very small numbers. In Python, a float is usually stored as a 64-bit double-precision value, so it holds roughly 15 to 17 significant decimal digits no matter how much memory is available. What exactly is 2147483647? It's the largest value that can be represented in a 32-bit signed integer (2^31 - 1), a limit you will run into in many other languages and file formats. A number like 3.1415926535897932384626433832795028841971693993751058209747942384659215... cannot be stored exactly in double precision; Python keeps only the closest value it can represent, about 3.141592653589793._
>
> _You might think that if you wanted to represent a number larger than 2147483647 then you would need to convert it into another data type. However, this isn't true. Python integers have arbitrary precision, so they can grow well past 2147483647 without any issues, limited only by available memory. The thing to watch is division: the `/` operator always returns a float, even when you divide a number by itself, while `//` performs floor division and returns an integer when both operands are integers._
>
> Do you crave this level of knowledge?  Check out [Introduction to Algorithms (ISBN: 978-0262033848)](https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844)
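
If you want to poke at these limits yourself, the following sketch is one way to do it. It only uses the standard library, and the variable names are made up for this example.

```python
import math
import sys

# Python integers are not capped at the 32-bit signed limit
big = 2147483647 + 1
bigger = 2 ** 100
print(big)     # 2147483648
print(bigger)  # 1267650600228229401496703205376

# True division always returns a float; floor division of two ints returns an int
quotient = big / big         # 1.0
floor_quotient = big // big  # 1

# Floats keep only a limited number of significant digits
print(math.pi)             # 3.141592653589793
print(sys.float_info.dig)  # decimal digits a float can always represent faithfully
print(sys.float_info.max)  # largest finite float on this machine
```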


## Sequential Types

Another way to store information is through a sequence of items. Sequences are useful for storing text. For example, the string "hello platypus!" is a sequence of letters and spaces.

The most common type of sequence in Python is a list. A list is a container for multiple elements. Each element is identified by a unique index. For example, the list `[1,2,3,4]` is a list of four numbers. Lists can be nested inside of each other, allowing for a very flexible way to organize data.  We can store information like names, phone numbers, crypto prices, or even your favorite Pokemon in sequential types.

Sequential types are useful when you want to keep track of something. For example, if you wanted to remember your favorite Pokémon, you could create a list called `pokemon` and put all the Pokémon you like in it.  Once you have created the list, you can access the contents of the list using the index number.
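
As a rough sketch of what that looks like in practice (the Pokémon chosen here are arbitrary examples):

```python
# A list of favorites; indexes start at 0
pokemon = ["Pikachu", "Bulbasaur", "Charmander", "Squirtle"]

first = pokemon[0]    # the first item
last = pokemon[-1]    # negative indexes count from the end

# Lists can grow after they are created
pokemon.append("Eevee")
count = len(pokemon)

# Lists can be nested inside each other
nested = [["Pikachu", "electric"], ["Bulbasaur", "grass"]]
second_name = nested[1][0]

# Strings are sequences too, so indexing works the same way
text = "hello platypus!"
first_letter = text[0]
```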

For example, if you wanted to write Python code that could scan your entire Pokemon list and find the Pokemon with the strongest combined attributes and then print its name, a sequential type would make this possible.
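
One possible way to sketch that scan is shown below. It assumes each Pokémon is stored as a dictionary with `attack`, `defence`, and `speed` keys; those keys, the helper `combined_attributes`, and the numbers themselves are illustrative assumptions, not a fixed format.

```python
# Each entry is a dictionary; the attribute values are illustrative only
pokemon = [
    {"name": "Pikachu", "attack": 55, "defence": 40, "speed": 90},
    {"name": "Bulbasaur", "attack": 49, "defence": 49, "speed": 45},
    {"name": "Charmander", "attack": 52, "defence": 43, "speed": 65},
    {"name": "Squirtle", "attack": 48, "defence": 65, "speed": 43},
]


def combined_attributes(entry):
    """Add up the numeric attributes of one Pokémon (hypothetical helper)."""
    return entry["attack"] + entry["defence"] + entry["speed"]


# max() walks the whole list and keeps the entry with the largest combined score
strongest = max(pokemon, key=combined_attributes)
print(strongest["name"])  # Pikachu
```

Because the list keeps everything in one place, a single call to `max()` is enough to compare every entry.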

![image.png](attachment:74148860-1d0f-4c5d-b51a-f3eaecfba2fa.png)

## Logical Types

The last data type we'll cover in any depth is the logical Boolean type.  A Boolean value can be stored in a variable, but it can also be produced as the result of a logical operation. Logical values are usually used in `if` statements and combined with the `or`, `and`, and `not` operators.
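
Here is a small sketch of both ideas: a Boolean produced by a comparison, and Booleans combined with `and`, `or`, and `not`. The scenario and variable names are invented for illustration.

```python
age = 20

# A comparison evaluates to a Boolean
is_adult = age >= 18   # True

# A Boolean can also be stored directly
has_ticket = False

# Combining Booleans with logical operators
can_enter = is_adult and has_ticket     # False: both sides must be True
should_wait = not has_ticket            # True: not flips the value
gets_discount = age < 12 or age >= 65   # False: neither side is True

# Booleans drive if statements
if can_enter:
    message = "Welcome in!"
else:
    message = "Sorry, not today."
```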

For example, let's say you want to write a program that checks whether a person has hair or not.  One way to do this would be to create a string variable called `hair` and assign it either `'bald'` or `'not bald'`.  

> ### <img src="https://snipboard.io/E4VrLz.jpg" style="float: left; width: 24px; margin: -2px 5px; vertical-align: middle;"> Learning more about Jupyter
>
> In the Jupyter environment, it's not necessary to `print("something")` every time you want to see the value of a specific variable, object, or resulting operation.  Whatever you put on the last line of a cell will be displayed in the result.  However, anything prior to it won't be displayed in the output below unless you print it.
>
> Also, make a conscious effort to use all the hotkeys that are available.  Whenever it makes sense to use one of them, we will call them out and explain their usage.  A specific goal that you should consider is learning as many shortcuts and hotkeys as you can when using Jupyter.  The less you touch your trackpad or mouse, the faster you will be able to express your ideas in Python and write code.  The mouse and trackpad only slow you down.
>
> Lastly, get in the habit of using `[cmd]-[enter]` on Mac, or `[ctrl]-[enter]` on PC, to run the cell your cursor is in rather than clicking the little "play" button at the top of the screen.

In [8]:
hair = "bald"
hair

'bald'

A better solution would be to create a new variable called `has_hair` and assign it a value of `True` or `False`, which is easy for Python to evaluate logically and can only be one or the other.  

In [9]:
has_hair = False
has_hair

False

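A string can hold numbers, letters, or nothing at all, so `hair` could end up with values your code doesn't expect.  A Boolean is a much better choice because it can only ever be `True` or `False`, which makes your code more predictable.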

Using logical types like Booleans is a great way not only to make your code more readable but also to store and evaluate conditions about a specific aspect of your data.  You could use a Boolean type for things like whether a user is logged in, if their account is active, or if their power level is over 9000.
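
To see why the Boolean version is easier to trust, consider this sketch. The capitalized `"Bald"` value and the other variable names are hypothetical, chosen only to show how a string comparison can quietly go wrong.

```python
# With a string, a small difference in spelling changes the result
hair = "Bald"                   # note the capital B
string_check = hair == "bald"   # False, even though we meant the same thing

# With a Boolean there are only two possible values to handle
has_hair = False
if not has_hair:
    advice = "Wear a hat in the sun."
else:
    advice = "No hat needed."

# Booleans are also a natural fit for storing the result of a condition
power_level = 9001
is_over_9000 = power_level > 9000
```

The string version needs every spelling and capitalization to match exactly, while the Boolean version has nothing to misspell.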


## Functions and Classes

Now that we've covered the basics of Python types, let's take a look at functions and classes. Functions are used to perform a task. They are often used for tasks that need to be repeated.  For example, if you wanted to make sure that your Pokemon list was sorted correctly, you could create a function called `sort_pokemon` that takes a list as an argument and sorts it according to a certain criterion.

> #### A garden variety function in the wild
> ```python
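> # `data` is assumed to be a pandas DataFrame with one column per attribute
> def sort_pokemon(data, criteria="defence"):
>     # Sort rows by the chosen column, highest values first
>     return data.sort_values(criteria, ascending=False)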
> ```
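
The function above relies on `sort_values`, which is a pandas method. If you only have a plain Python list, a comparable sketch can be written with the built-in `sorted()` instead. This is a swapped-in technique for illustration, and the list-of-dictionaries layout and sample numbers are assumptions.

```python
def sort_pokemon(data, criteria="defence"):
    """Return a new list sorted by the given key, highest values first."""
    return sorted(data, key=lambda entry: entry[criteria], reverse=True)


# Illustrative data: each Pokémon is a dictionary of attributes
team = [
    {"name": "Pikachu", "defence": 40, "speed": 90},
    {"name": "Bulbasaur", "defence": 49, "speed": 45},
    {"name": "Charmander", "defence": 43, "speed": 65},
    {"name": "Squirtle", "defence": 65, "speed": 43},
]

by_defence = [entry["name"] for entry in sort_pokemon(team)]
by_speed = [entry["name"] for entry in sort_pokemon(team, criteria="speed")]

print(by_defence)  # ['Squirtle', 'Bulbasaur', 'Charmander', 'Pikachu']
print(by_speed)    # ['Pikachu', 'Charmander', 'Bulbasaur', 'Squirtle']
```

Calling the same function twice with different `criteria` values is exactly the kind of repeated task functions are good for.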

Classes are used to group together related data and functions.  For example, you could create a class called Pokemon that contains functions to handle all of the things that are associated with a Pokemon.  These functions could include things like finding the Pokemon's type, name, weight, height, etc.

> #### A basic class you might see sometime
>```python
> class Pokemon:
>     
>     name = "Davechu"
>     poketype = "electric"
>     weight = 180
>     
>     def __init__(self, **kwds):
>         for attribute, value in kwds.items():
>             setattr(self, attribute, value)
>     
>     def poke_party(self, party_type = "pizza"):
>         return self.name + " is having a " + party_type + " party!"
>     
>```
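
To see how a class like this might be used, here is a trimmed-down, runnable sketch. It keeps only the `name` and `poketype` attributes and calls `setattr(self, attribute, value)` so that keyword arguments override the defaults. The instances and the `"taco"` party type are made up for the example.

```python
class Pokemon:
    # Class-level defaults, used when no keyword argument overrides them
    name = "Davechu"
    poketype = "electric"

    def __init__(self, **kwds):
        # Store every keyword argument as an attribute on this instance
        for attribute, value in kwds.items():
            setattr(self, attribute, value)

    def poke_party(self, party_type="pizza"):
        return self.name + " is having a " + party_type + " party!"


# No arguments: the defaults are used
default_pokemon = Pokemon()
default_message = default_pokemon.poke_party()

# Keyword arguments replace the defaults for this instance only
custom_pokemon = Pokemon(name="Bulbasaur", poketype="grass")
custom_message = custom_pokemon.poke_party("taco")

print(default_message)  # Davechu is having a pizza party!
print(custom_message)   # Bulbasaur is having a taco party!
```

Each instance carries its own data, while the `poke_party` function is written once and shared by all of them.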

When writing programs, it's helpful to separate out the parts of your code that do the same thing.  Classes and functions are the backbone of Python, and they help us reuse code rather than write the same thing over and over.  We won't dive too deep into this until we start learning about Pandas and working with panel data types, which is where all of these intro topics lead.