<a target="_blank" href="https://colab.research.google.com/github/lukebarousse/Python_Data_Analytics_Course/blob/main/1_Basics/03_Data_Types.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Data Types

## Overview

### Notes

#### What are Data Types?

* Python can store data of different types, such as whole numbers, decimals, and text
* Each type supports its own set of operations

#### What can we do with Data Types?

Well, with `int` we can do math operations like `+` and `-`.

In [1]:
total_salary = 110000
bonus_salary = 10000

In [2]:
type(total_salary)

int

In [3]:
base_salary = total_salary - bonus_salary

base_salary

100000

However, that doesn't mean we can do the same with `str`.

In [4]:
job_title = 'Data Analyst'
remove_word = 'Data'

In [5]:
type(job_title)

str

In [6]:
final_title = job_title - remove_word

final_title

TypeError: unsupported operand type(s) for -: 'str' and 'str'
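
Strings do support some operators, just not `-`. As a minimal sketch (the variable `full_title` is made up for illustration), `+` joins two strings, and one possible way to drop a word is the `.replace()` method:

```python
job_title = 'Data Analyst'
remove_word = 'Data'

# + joins (concatenates) two strings
full_title = 'Senior ' + job_title

# Replace the unwanted word with an empty string, then strip the leftover space
final_title = job_title.replace(remove_word, '').strip()

print(full_title)   # Senior Data Analyst
print(final_title)  # Analyst
```

This is only one approach; the point is that each type comes with its own operations.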

#### Built-in Types

Python has the following data types built in by default:

| Type               | Data Type           | Example               |
|--------------------|---------------------|-----------------------|
| Text               | `str`               | `"Data Nerd!"`     |
| Numeric            | `int`               | `42`                  |
|                    | `float`             | `3.14159`             |
|                    | `complex`           | `1 + 2j`              |
| Sequence           | `list`              | `[1, 2, 3]`           |
|                    | `tuple`             | `(1, 2, 3)`           |
|                    | `range`             | `range(10)`           |
| Mapping            | `dict`              | `{"key": "value"}`    |
| Set                | `set`               | `{1, 2, 3}`           |
|                    | `frozenset`         | `frozenset([1, 2, 3])`|
| Boolean            | `bool`              | `True` or `False`     |
| Binary             | `bytes`             | `b"Data"`            |
|                    | `bytearray`         | `bytearray(5)`        |
|                    | `memoryview`        | `memoryview(b"Data")`|
| None               | `NoneType`          | `None`                |

For more information on the different data types in Python, check out the Python documentation on built-in types [here](https://docs.python.org/3/library/stdtypes.html).
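
To see these types for yourself, a quick sketch like the one below passes one example value per row of the table to `type()`. The names `examples` and `type_names` are just for illustration.

```python
examples = [
    "Data Nerd!", 42, 3.14159, 1 + 2j,
    [1, 2, 3], (1, 2, 3), range(10),
    {"key": "value"}, {1, 2, 3}, frozenset([1, 2, 3]),
    True, b"Data", bytearray(5), memoryview(b"Data"), None,
]

# type(value).__name__ gives the short name of the type, such as 'str'
type_names = [type(value).__name__ for value in examples]

for name, value in zip(type_names, examples):
    print(name, '->', repr(value))
```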

#### Types Common in Data Analytics

We'll mainly focus on the most common ones in data analytics:
* Text Type: `str`
* Numeric Types: `int`, `float`
* Sequence Types: `list`, `tuple`
* Mapping Type: `dict`
* Set Types: `set`
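
As a rough illustration of how these types might show up together, here is a made-up job posting. All names and values are hypothetical.

```python
job_title = 'Data Analyst'                 # str: text
job_id = 102                               # int: whole number
salary_rate = 95000.50                     # float: number with decimals
job_skills = ['Python', 'SQL', 'Python']   # list: ordered, can change, allows duplicates
salary_range = (90000, 120000)             # tuple: ordered, cannot change
job_posting = {'title': job_title, 'id': job_id}  # dict: key-value pairs
unique_skills = set(job_skills)            # set: unordered, no duplicates

print(len(job_skills))       # 3
print(len(unique_skills))    # 2
print(job_posting['title'])  # Data Analyst
```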

### Importance

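Data types are fundamental to data processing. Pandas and matplotlib automatically handle different data types for operations like mathematical calculations, data manipulation, and plotting, so knowing which type you are working with helps you predict how those operations will behave.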
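
The same idea is visible in plain Python, before any library is involved. In this sketch (the variable names are hypothetical), salary figures that arrive as text behave very differently from the same figures stored as numbers:

```python
# Numbers stored as text, as they might arrive from a file or web form
base_text = '100000'
bonus_text = '10000'

# + on two strings joins them instead of adding
joined = base_text + bonus_text

# Converting to int first gives the arithmetic we actually want
total_salary = int(base_text) + int(bonus_text)

print(joined)        # 10000010000
print(total_salary)  # 110000
```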



## View Data Type


### Notes

* View the data type using the `type()` function


### Example

In [7]:
company_name = "DataWiz Inc."

type(company_name)

str

You can also use functions inside of other functions: the result of the inner function is passed to the outer one.

In [8]:
print(type(company_name))

<class 'str'>


**Note**: We don't really need `print()` here, because a Jupyter Notebook automatically displays the value of the last expression in a cell.
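
Nested calls are evaluated from the inside out. A small sketch, using the built-in `len()` function (not covered above) purely as an example of an inner call:

```python
company_name = "DataWiz Inc."

# Step by step: len() runs first, then type() receives its result
name_length = len(company_name)
length_type = type(name_length)

# The same thing written as one nested call
nested_type = type(len(company_name))

print(name_length)  # 12
print(nested_type)  # <class 'int'>
```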

## Check Data Types

Check the data types of the variables using the `type()` function.

Note: In this case we are using `print()` so we can see the data types of all our variables, not just the last one.

In [27]:
job_id = 102
company_name = 'Data Nerd, Inc.'
job_title = 'Data Scientist'
salary_rate = 170000.00
job_work_from_home = True

In [28]:
# Check data types
print(type(job_id))
print(type(company_name))
print(type(job_title))
print(type(salary_rate))
print(type(job_work_from_home))

<class 'int'>
<class 'str'>
<class 'str'>
<class 'float'>
<class 'bool'>


You can also check whether a variable is the type you think it is using `isinstance()`. It takes two arguments:
1. *Object* - what you want to check
2. *Type* - the type you want to check the object against

For example, say we want to check whether `job_id` is a `float`. If the object is the type you passed in, `isinstance()` returns `True`; if not, it returns `False`.

In [29]:
isinstance(job_id, float)

False

In [30]:
isinstance(company_name, str)

True
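
`isinstance()` can also check against several types at once by passing a tuple as the second argument. The sketch below reuses the variables defined above and also shows one quirk worth knowing: in Python, `bool` is a subclass of `int`. The result variable names are made up for illustration.

```python
job_id = 102
salary_rate = 170000.00
job_work_from_home = True

# A tuple of types means "is it any of these?"
id_is_number = isinstance(job_id, (int, float))
rate_is_number = isinstance(salary_rate, (int, float))

# bool is a subclass of int, so this is True as well
flag_is_int = isinstance(job_work_from_home, int)

# type() compares the exact type, so it tells bool and int apart
flag_is_exactly_int = type(job_work_from_home) is int

print(id_is_number, rate_is_number)      # True True
print(flag_is_int, flag_is_exactly_int)  # True False
```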

Foreshadowing: We'll be using `isinstance()` in the advanced and project chapters 😈

## Get `help()` on Data Types (or really anything)

We can use the `help()` function to investigate data types, functions, methods... really any objects!


In [31]:
help(isinstance)

Help on built-in function isinstance in module builtins:

isinstance(obj, class_or_tuple, /)
    Return whether an object is an instance of a class or of a subclass thereof.
    
    A tuple, as in ``isinstance(x, (A, B, ...))``, may be given as the target to
    check against. This is equivalent to ``isinstance(x, A) or isinstance(x, B)
    or ...`` etc.



Using the `help()` function we can see what "methods" are available to a data type.

A "method" in Python is like a tool that performs a specific task on or with an object.  

For instance, a string object has a method `.upper()`, which you can use like `your_string.upper()`. This takes `your_string` and returns a copy with all its letters in uppercase. It's like asking the string to produce an uppercase version of itself.
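
A short sketch of calling a method. It also uses the built-in `dir()` function (not covered above, shown here as one possible shortcut) to list method names without reading the full `help()` output:

```python
job_title = 'Data Analyst'

# .upper() returns a new string; the original is left unchanged
upper_title = job_title.upper()

print(upper_title)  # DATA ANALYST
print(job_title)    # Data Analyst

# dir() lists the names available on a type; skip the ones starting with "_"
public_methods = [name for name in dir(str) if not name.startswith('_')]
print(len(public_methods), 'string methods, including:', public_methods[:3])
```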

In [32]:
help(str) # Check documentation of objects including: str, int, float, bool

Help on class str in module builtins:

class str(object)
 |  str(object='') -> str
 |  str(bytes_or_buffer[, encoding[, errors]]) -> str
 |  
 |  Create a new string object from the given object. If encoding or
 |  errors is specified, then the object must expose a data buffer
 |  that will be decoded using the given encoding and error handler.
 |  Otherwise, returns the result of object.__str__() (if defined)
 |  or repr(object).
 |  encoding defaults to sys.getdefaultencoding().
 |  errors defaults to 'strict'.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __format__(self, format_spec, /)
 |      Return a formatted version of the string as described by format_spec.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  

## Set Data Type

### Notes

* Set the specific data type using constructor *functions*.
* First, what is a *function*? We'll dive into functions a lot more later, but for now understand that it's a block of code designed to do a specific task.
    * It can take input (known as arguments), execute a series of operations, and return an output.
    * We've already used a function called `print()` several times. It outputs its argument (for example, the string of text you typed in) to the console.
        * For example, `print('hello world')` will print out 'hello world'.
        * `print()` focuses on displaying output.
    * But most functions perform calculations, manipulate data, or return data, as you'll see below.
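
To make the input, operations, output idea concrete, here is a minimal sketch of a custom function. The name `add_bonus` and its parameters are hypothetical; defining functions is covered properly later.

```python
def add_bonus(base_salary, bonus_salary):
    """Return the total salary for the given base and bonus."""
    return base_salary + bonus_salary


# The function takes two arguments and returns a value we can store
total_salary = add_bonus(100000, 10000)
print(total_salary)  # 110000

# print() displays its argument but returns None
print_result = print('hello world')
print(print_result)  # None
```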

Below are the constructor functions for common data types:

| Data Type | Syntax                     | Example                           |
|-----------|----------------------------|-----------------------------------|
| String    | `str()`                    | `str('Hello World')`              |
| Integer   | `int()`                    | `int(20)`                         |
| Float     | `float()`                  | `float(20.5)`                     |
| Complex   | `complex()`                | `complex(1j)`                     |
| List      | `list()`                   | `list(('apple', 'banana', 'cherry'))` |
| Tuple     | `tuple()`                  | `tuple(('apple', 'banana', 'cherry'))`|
| Range     | `range()`                  | `range(6)`                        |
| Dictionary| `dict()`                   | `dict(name='John', age=36)`       |
| Set       | `set()`                    | `set(('apple', 'banana', 'cherry'))`  |
| Frozenset | `frozenset()`              | `frozenset(('apple', 'banana', 'cherry'))`|
| Boolean   | `bool()`                   | `bool(5)`                         |
| Bytes     | `bytes()`                  | `bytes(5)`                        |
| Bytearray | `bytearray()`              | `bytearray(5)`                    |
| Memoryview| `memoryview()`             | `memoryview(bytes(5))`            |
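
Constructor functions are most useful for converting a value from one type to another. A short sketch with made-up values; note that a conversion fails with a `ValueError` when the text does not look like a number:

```python
job_id = int('102')               # text to whole number
salary_rate = float('170000.50')  # text to decimal number
job_id_text = str(102)            # number to text
has_bonus = bool(0)               # 0 converts to False, other numbers to True
skills = list(('Python', 'SQL'))  # tuple to list

print(job_id, salary_rate, job_id_text, has_bonus, skills)

# Not every conversion is possible
try:
    int('Data Nerd')
except ValueError as error:
    conversion_error = str(error)
    print('Could not convert:', conversion_error)
```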

### Example

Set the specific data type for each variable. Let's specify a new job posting.

🪲**Debugging**

**This is an intentional mistake**

This is used to demonstrate debugging.

Error: Missing the closing single quote `'` after `Data Nerd, Inc.`

```
company_name = str('Data Nerd, Inc.)
```

Steps to Debug:

1. Look at the actual error. Can you tell what the problem is?
2. If not, then look it up:
   1. Use a chatbot like ChatGPT or Claude
   2. Look it up using Google
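
For reference, a sketch of what the error and the fix look like. Because a syntax error stops a cell from running at all, this example uses the built-in `compile()` function (an assumption made only so the error can be caught and printed); the exact wording of the message depends on your Python version.

```python
broken_line = "company_name = str('Data Nerd, Inc.)"

# compile() checks the code without running it, so the error can be caught
try:
    compile(broken_line, '<example>', 'exec')
    found_syntax_error = False
except SyntaxError as error:
    found_syntax_error = True
    print('SyntaxError:', error.msg)

# The fix: add the closing single quote
company_name = str('Data Nerd, Inc.')
print(company_name)  # Data Nerd, Inc.
```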

In [33]:
job_id = 102

type(job_id)

int

#### What if we want job_id to have decimal places instead?

In [34]:
# Set job_id to a float value instead of an integer
job_id = float(102)

print("Job ID:         ", job_id)
print("Type of Job ID: ", type(job_id))

Job ID:          102.0
Type of Job ID:  <class 'float'>
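
Going the other way, `int()` converts a float back to a whole number. One detail worth knowing is that `int()` drops the decimal part rather than rounding; the built-in `round()` function (not covered above) is one option when rounding is what you want. The value `102.9` is made up for illustration.

```python
job_id = float(102)

# The float and the int compare as equal, even though the types differ
print(job_id == 102)  # True

# int() drops everything after the decimal point
truncated = int(102.9)

# round() goes to the nearest whole number
rounded = round(102.9)

print(truncated)  # 102
print(rounded)    # 103
```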
