# Data Types

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ali-rivera/Python-Support-Hours/blob/main/Week2/Week2_Resources.ipynb)

Welcome to Python!! Here we will review some of the most common Python data types that you'll likely use in your data science work.

I encourage you to use this resource as you get comfortable with Python and reference it as needed.

## Data Types

There are several data types in Python that will be useful throughout your Python experience.

Some common data types are:
- strings `str`
    - a sequence of characters; digits inside a string are treated as characters, so you can't do math with them
- integers `int`
    - a whole number: positive, negative, or 0
- floats `float`
    - a number with decimal places
- booleans `bool`
    - `True` or `False` values
- lists `list`
    - an ordered, mutable collection that allows duplicates
- tuples `tuple`
    - an ordered, immutable collection that allows duplicates
- dictionaries `dict`
    - an ordered\*, mutable collection of key-value pairs, accessed by key instead of by index
- sets `set`
    - an unordered, mutable collection of unique items (duplicates are dropped)

Python has other built-in data types as well, and packages can add more (like DataFrames, a common data type from `pandas`, or arrays, a common data type from `NumPy`!)

*\*Dictionaries preserve insertion order in Python 3.7 and later.*

Here are some examples of each:

In [1]:
str1 = "This is a string!" 
int1 = 2 
float1 = 3.14 
bool1 = True 
list1 = ["string", 0, 1.23, False] 
tuple1 = ("lucky tuple", 7, 7, 7) 
dict1 = {"key1": "value 1", "key2": [1,2,3, "four"]} 
set1 = {"value1", 2, 2, True}  # the duplicate 2 is dropped when the set is created



In [2]:
## Play this cell to see the value of each of the variables defined above

print("str:\t",str1)
print("int:\t", int1)
print("float:\t", float1) 
print("bool:\t", bool1)
print("list:\t", list1)
print("tuple:\t", tuple1)
print("dict:\t", dict1)
print("set:\t", set1)

str:	 This is a string!
int:	 2
float:	 3.14
bool:	 True
list:	 ['string', 0, 1.23, False]
tuple:	 ('lucky tuple', 7, 7, 7)
dict:	 {'key1': 'value 1', 'key2': [1, 2, 3, 'four']}
set:	 {True, 2, 'value1'}
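
If you are unsure which type a value has, the built-in `type()` and `isinstance()` functions can tell you. The sketch below also shows why a number stored as a string can't be used for math until it is converted. The variable names are made up for illustration.

```python
# Illustrative values (these names are assumptions, not from the cells above)
text_number = "2"      # a str that happens to contain a digit
real_number = 2        # an int

print(type(text_number))             # <class 'str'>
print(type(real_number))             # <class 'int'>
print(isinstance(real_number, int))  # True

# "+" joins strings but adds numbers
joined = text_number + text_number   # '22'
added = real_number + real_number    # 4

# Convert the string to an int first to do math with it
converted = int(text_number) + real_number  # 4
print(joined, added, converted)
```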


*Note: `\t` prints a tab. It is an escape character; there are several others, and they are helpful for formatting. See the link below for more info.*
<https://www.w3schools.com/python/gloss_python_escape_characters.asp> 
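
As a quick illustration of a few other escape characters, here is a small sketch. The strings are invented examples.

```python
# A few common escape characters inside string literals
tabbed = "name:\tPython"       # \t inserts a tab
two_lines = "line 1\nline 2"   # \n starts a new line
quoted = "She said \"hi\""     # \" puts a double quote inside a double-quoted string
backslash = "C:\\data"         # \\ produces a single backslash

print(tabbed)
print(two_lines)
print(quoted)
print(backslash)
```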


### Mutable vs. Immutable

**Mutable** data types can be changed in place after they are created. Mutable types include:
- `list`
- `dict`
- `set`

In [3]:
## A mutable example

change_list = [1, 2, "X", 4]
change_list

[1, 2, 'X', 4]

In [4]:
change_list[2] = 3
change_list

[1, 2, 3, 4]
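
The same kind of in-place change works on other mutable containers. Here is a minimal sketch with a list and a dictionary; the names `scores` and `ages` are made up for illustration.

```python
# Lists and dictionaries can be changed in place after they are created
scores = [90, 85]
scores.append(70)        # add an item to the end
scores[0] = 95           # replace the item at index 0
print(scores)            # [95, 85, 70]

ages = {"ann": 30}
ages["bob"] = 25         # add a new key-value pair
ages["ann"] = 31         # overwrite the value for an existing key
print(ages)              # {'ann': 31, 'bob': 25}
```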

**Immutable** items cannot be changed after they are created; any "change" creates a new object instead. Immutable types include:
- `str`
- `int`
- `float`
- `bool`
- `tuple`

In [5]:
change_str = "strXng"
change_str

'strXng'

In [6]:
#try replacing the X with an i:

change_str[3] = "i" # produces an error
change_str

TypeError: 'str' object does not support item assignment
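
Since a string can't be edited in place, the usual workaround is to build a new string. The sketch below shows two ways to do that, and then shows that tuples reject item assignment in the same way. The name `point` is made up for illustration.

```python
change_str = "strXng"

# Option 1: str.replace() returns a new string
fixed_replace = change_str.replace("X", "i")

# Option 2: slice around index 3 and join the pieces
fixed_slice = change_str[:3] + "i" + change_str[4:]

print(fixed_replace)   # string
print(fixed_slice)     # string
print(change_str)      # strXng (the original is unchanged)

# Tuples behave the same way: item assignment raises a TypeError
point = (1, 2)
try:
    point[0] = 10
except TypeError as err:
    print("TypeError:", err)
```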

In [7]:
# .remove() produces an error here because "spicy water" is not in the set
set1.remove("spicy water")

KeyError: 'spicy water'

In [8]:
set1

{2, True, 'value1'}
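
The `KeyError` above is raised because "spicy water" is not in the set, not because sets can't change. A minimal sketch of adding and removing set items, which re-creates `set1` so it runs on its own:

```python
# Re-create the set from the earlier cell so this sketch is self-contained
set1 = {"value1", 2, 2, True}

set1.add("new item")          # sets can grow after they are created
set1.remove(2)                # remove() works when the item is present
set1.discard("spicy water")   # discard() does nothing if the item is missing

try:
    set1.remove("spicy water")  # remove() raises KeyError for a missing item
except KeyError as err:
    print("KeyError:", err)

print(set1)  # the display order of a set may vary
```

Use `.discard()` when you don't care whether the item was there, and `.remove()` when a missing item should be treated as a mistake.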

### Ordered vs. Unordered

**Ordered** data types have a set "order" that can be used to index the elements by position. Ordered data types include:
- `str`
- `list`
- `dict`*
- `tuple`

\*Dictionaries keep insertion order, but their items are accessed by key rather than by position.

**Unordered** data types do not have a defined order and cannot be indexed by position. A `set` is unordered.


In [9]:
# ordered
str1[0]

'T'

In [10]:
# unordered
set1[0] # produces error

TypeError: 'set' object is not subscriptable
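
To round out the comparison, here is a short sketch of how dictionaries and sets are typically accessed instead. It re-creates `dict1` and a set like `set1` so it runs on its own.

```python
dict1 = {"key1": "value 1", "key2": [1, 2, 3, "four"]}
set1 = {"value1", 2, True}

# Dictionaries are looked up by key, not by position
print(dict1["key1"])        # value 1
print(dict1["key2"][3])     # four (index into the list stored under "key2")

# Sets can't be indexed, but you can test membership or loop over them
print("value1" in set1)     # True
print(3 in set1)            # False

# Convert to a list if you need positions (the order is not guaranteed)
as_list = list(set1)
print(len(as_list))         # 3
```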

### Indexing

Remember that Python indexing starts at 0, so the string below is indexed as follows (the `_` stands for the space):

| H   | e   | l  | l  | o  |  _  | w  | o  | r  | l  | d  |
|-----|-----|----|----|----|----|----|----|----|----|----|
| 0   | 1   | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 |


You can also use negative indexing, which starts at -1 at the end. This can be helpful when trying to remove a file extension (like .txt or .doc) from strings of different lengths.

| H   | e   | l  | l  | o  |  _  | w  | o  | r  | l  | d  |
|-----|-----|----|----|----|----|----|----|----|----|----|
| -11 | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |
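
The two tables above can be checked directly in code. A small sketch, using a made-up variable name `greeting`:

```python
greeting = "Hello world"

print(len(greeting))   # 11 characters, so valid indexes run from 0 to 10
print(greeting[0])     # H
print(greeting[4])     # o
print(greeting[-1])    # d
print(greeting[-11])   # H

# A negative index i refers to the same position as len(greeting) + i
print(greeting[-3] == greeting[len(greeting) - 3])  # True
```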


In [11]:
## To get an item using the index, use []
list1[0]

'string'

In [12]:
## To get the index of an item, use the .index() method
list1.index("string") # note: this returns the first index in the event of duplicate values

0
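
Here is a short sketch of how `.index()` behaves with duplicates and with a missing item. It re-creates `tuple1` so it runs on its own; the value 13 is just an example of something that isn't in the tuple.

```python
tuple1 = ("lucky tuple", 7, 7, 7)

print(tuple1.index(7))   # 1 (position of the first 7)
print(tuple1.count(7))   # 3 (how many times 7 appears)

# .index() raises a ValueError if the item is not found,
# so check with "in" first when you are not sure
if 13 in tuple1:
    position = tuple1.index(13)
else:
    position = None
print(position)          # None
```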

You can also use indexing to perform *slicing*, which selects a section of a sequence. You can specify the starting point (inclusive) and the ending point (exclusive) (ex: `str1[start:end]`), or leave either side of the `:` blank to signify 'the rest'.

In [13]:
str2 = "Hello world"
# print hello only
print(str2[:5])
print(str2[0:5])
print(str2[:-6])

# print last 3 characters
print(str2[8:])
print(str2[8:11])
print(str2[-3:])


Hello
Hello
Hello
rld
rld
rld
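
Negative indexes combined with slicing give one way to strip a file extension from names of different lengths. This is a minimal sketch with hypothetical file names, assuming every extension is exactly four characters long including the dot.

```python
# Hypothetical file names of different lengths, each ending in a 4-character extension
file_names = ["notes.txt", "final_report.doc", "a.txt"]

for name in file_names:
    base = name[:-4]       # everything except the last 4 characters
    extension = name[-4:]  # only the last 4 characters
    print(base, extension)

# The same slice inside a list comprehension
bases = [name[:-4] for name in file_names]
print(bases)
```

If the extensions can have different lengths (like `.py` or `.ipynb`), a fixed slice won't work; the standard library's `os.path.splitext()` handles that case.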


## Next Steps

The best way to get better at coding is to do it. Try these problems from W3Schools to practice using different data types and associated functions. As you go, practice looking up documentation for the things you don't know or aren't sure of.

https://www.w3schools.com/python/exercise.asp

## Additional Resources

Python cheat sheets from DataCamp: https://www.datacamp.com/cheat-sheet/getting-started-with-python-cheat-sheet

*DataCamp is also a great resource for additional instruction, examples, and practice problems. This is a paid service, but the cheat sheets are free!*
