# Python Collections

* Here is a <font color='blue' face="Courier New" size="+1">list</font> of numbers
  * We can add and change elements as well as extract elements individually or in slices
  * An index selects a single element and a slice selects a range of elements; both go in square brackets <font color='blue' face="Courier New" size="+1">[ ]</font> after the variable name

In [None]:
numbers = [4, 2, 2, 10, 12, 3, 7, 12]
print(numbers)         # print the entire list, note the square brackets in the output
print(numbers[0])      # print the first element, 0 is the first element
print(numbers[2])      # print the third element, because it starts with 0
print(numbers[2:])     # print the range of elements starting with element 2 (the third element) to the end, note the square brackets
print(numbers[:2])     # print the range from the start (element zero) up to but NOT including element 2
print(numbers[2:-2])   # print the range starting at element two up to but NOT including the second from the end (negative 2)

[4, 2, 2, 10, 12, 3, 7, 12]
4
2
[2, 10, 12, 3, 7, 12]
[4, 2]
[2, 10, 12, 3]
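
* As a small sketch of the slicing rules above (the names `first_three` and `last_two` are made up for illustration), note that a slice gives us a new list, so changing the slice leaves the original alone

```python
numbers = [4, 2, 2, 10, 12, 3, 7, 12]

first_three = numbers[:3]    # elements 0, 1 and 2
last_two = numbers[-2:]      # the final two elements
print(first_three)
print(last_two)

first_three[0] = 99          # change the slice, not the original list
print(first_three)
print(numbers[0])            # still the original first element
```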


* A <font color='blue' face="Courier New" size="+1">list</font> is also mutable (changeable)
  * We can change, add, remove and sort elements


In [None]:
numbers = [4, 2, 2, 10, 12, 3, 7, 12]
numbers[0] = 5         # change the first element to a value of 5
print(numbers)

numbers.remove(2)      # remove the first element with a value of 2
print(numbers)

numbers.append(20)     # add a value of twenty to the end of the list
print(numbers)

del numbers[1]         # delete element 1 from the list
print(numbers)
print('sorted', sorted(numbers)) # print a list of the numbers sorted, but don't change the list itself
print('original', numbers)

numbers.sort()         # this will mutate or change the list order
print(numbers)

[5, 2, 2, 10, 12, 3, 7, 12]
[5, 2, 10, 12, 3, 7, 12]
[5, 2, 10, 12, 3, 7, 12, 20]
[5, 10, 12, 3, 7, 12, 20]
sorted [3, 5, 7, 10, 12, 12, 20]
original [5, 10, 12, 3, 7, 12, 20]
[3, 5, 7, 10, 12, 12, 20]
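
* A common slip is to assign the result of the <font color='blue' face="Courier New" size="+1">sort</font> method to a variable. The sketch below (the names `ordered` and `result` are just for illustration) shows that <font color='blue' face="Courier New" size="+1">sorted</font> returns a new list while <font color='blue' face="Courier New" size="+1">sort</font> changes the list in place and returns <font color='blue' face="Courier New" size="+1">None</font>

```python
numbers = [5, 10, 12, 3, 7]

ordered = sorted(numbers)    # builds and returns a new sorted list
print(ordered)
print(numbers)               # the original order is unchanged so far

result = numbers.sort()      # sorts the list in place and returns None
print(result)
print(numbers)               # now the list itself is sorted
```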


* A <font color='blue' face="Courier New" size="+1">list</font> can contain any kind of data
  * If all the elements are the same data type we call it homogeneous
  * If they are different we call it heterogeneous


In [None]:
# homogeneous list of strings
names = ['Sri', 'Alex', 'Martha']
print(names[1])

# heterogeneous list of different types
costs = [10, 12.3, 10.02, 'ten']
print(costs)


Alex
[10, 12.3, 10.02, 'ten']


## Try it now: ##
#### 1.	Create a simple list of five strings.
#### 2.	Print the first and last element.
#### 3.	Change the second element to something different.
#### 4. Sort the list and display it.
<br>
<details><summary>Click for <b>hint</b></summary>
<p>
1.	Make a list of colors or names. Use <font color='blue' face="Courier New" size="+1">[ ]</font> and a quoted list of values separated by commas.
<br>
2.	Use <font color='blue' face="Courier New" size="+1">[0]</font> for the first element and <font color='blue' face="Courier New" size="+1">[-1]</font> for the last element.
<br>
3.	Use <font color='blue' face="Courier New" size="+1">[1]</font> for the second element and an <font color='blue' face="Courier New" size="+1"> = </font> to change the value.
<br>
4.	Use the <font color='blue' face="Courier New" size="+1">sort</font> method, not the <font color='blue' face="Courier New" size="+1">sorted</font> function.
<br>
<br>
</p>
</details>

<details><summary>Click for <b>code</b></summary>
<p>

```python
colors = ['red', 'blue', 'green', 'yellow', 'white']
print(colors[0], colors[-1])
colors[1] = 'black'
colors.sort()
print(colors)
```
</p>
</details>

red white
['black', 'green', 'red', 'white', 'yellow']


* Similar to lists are sets and tuples
  * A <font color='blue' face="Courier New" size="+1">set</font> makes sure all the elements in the collection are unique
    * we use curly braces <font color='blue' face="Courier New" size="+1">{ }</font> to surround the elements
  * A <font color='blue' face="Courier New" size="+1">tuple</font> is just like a <font color='blue' face="Courier New" size="+1">list</font> but is not mutable, so once we assign values to it, we can't change them
    * we use parentheses <font color='blue' face="Courier New" size="+1">( )</font> to surround the elements
    * immutability is slightly more efficient in some cases

In [19]:
set1 = {1, 2, 3, 4, 4, 5, 6, 3}
print(set1)

tuple1 = ('a', 1, 'b')
print(tuple1[0], tuple1[-1], tuple1[0:2])
# but you cannot change the tuple so the following code would fail
# tuple1[0] = 'z'

{1, 2, 3, 4, 5, 6}
a b ('a', 1)
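
* Here is a minimal sketch of both ideas in use. The `visits` list is made-up sample data, and the <font color='blue' face="Courier New" size="+1">try</font> / <font color='blue' face="Courier New" size="+1">except</font> block is only there so the failed tuple assignment prints a message instead of stopping the cell

```python
# a set drops duplicates, which makes it handy for counting unique values
visits = ['Sri', 'Alex', 'Sri', 'Martha', 'Alex']
unique_visitors = set(visits)
print(len(visits), len(unique_visitors))
print('Alex' in unique_visitors)       # membership test; sets have no guaranteed order

# a tuple refuses any attempt to change one of its elements
tuple1 = ('a', 1, 'b')
try:
  tuple1[0] = 'z'
except TypeError as error:
  print('cannot change a tuple:', error)
print(tuple1)
```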


* A dictionary or <font color='blue' face="Courier New" size="+1">dict</font> is another kind of collection that takes key/value pairs
* It's sort of like field names and values
  * Often used as a way to represent a record from something like a SQL table
* It also uses curly braces <font color='blue' face="Courier New" size="+1">{ }</font> like a <font color='blue' face="Courier New" size="+1">set</font> does, but it uses a colon <font color='blue' face="Courier New" size="+1">:</font> to separate the key and value
* They are mutable like lists and sets but instead of using a position to access a single element we use the key to fetch its value
  * Keys are usually strings but don't have to be

In [16]:
record = {'firstname': 'Adam', 'lastname': 'Smith', 'age': 50}
print(record)
print(record['firstname'])     # print the value for the element named firstname
record['lastname'] = 'Jones'   # change the element for the key lastname to a new value
record['gender'] = 'M'         # add a new key and value
print(record)
print(record.keys())           # print a list of just the keys
print(record.values())         # print a list of just the values
del record['gender']           # remove a key from the dictionary
print(record)

{'firstname': 'Adam', 'lastname': 'Smith', 'age': 50}
Adam
{'firstname': 'Adam', 'lastname': 'Jones', 'age': 50, 'gender': 'M'}
dict_keys(['firstname', 'lastname', 'age', 'gender'])
dict_values(['Adam', 'Jones', 50, 'M'])
{'firstname': 'Adam', 'lastname': 'Jones', 'age': 50}
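
* Asking for a key that is not there with square brackets raises an error. The sketch below shows two standard ways to stay safe, the <font color='blue' face="Courier New" size="+1">in</font> operator and the <font color='blue' face="Courier New" size="+1">get</font> method, plus a loop over the key/value pairs (the fallback text 'unknown' is just an example value)

```python
record = {'firstname': 'Adam', 'lastname': 'Smith', 'age': 50}

print('age' in record)                    # check whether a key exists
print(record.get('gender'))               # get returns None when the key is missing
print(record.get('gender', 'unknown'))    # or a fallback value that we choose

# items() gives us each key together with its value
for key, value in record.items():
  print(key, '=', value)
```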


* Even strings are similar to lists and we can treat each character of the string the same way we do each element of a collection
* That means we can get individual elements or slices with square bracket syntax just like lists


In [20]:
name = 'Han Solo'
print(name[0:3])

Han
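
* One difference worth a quick sketch: strings can be indexed and sliced like lists, but they are not mutable, so we build a new string instead of changing a character in place (the name `new_name` is just for illustration)

```python
name = 'Han Solo'
print(len(name))          # number of characters, including the space
print(name[-4:])          # the last four characters

# name[0] = 'B' would fail because strings cannot be changed in place,
# so we combine a new first letter with a slice of the old string
new_name = 'B' + name[1:]
print(new_name)
print(name)               # the original string is untouched
```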


* Generally we use collections in combination with a <font color='blue' face="Courier New" size="+1">for</font> loop to do something to each element of the collection


In [22]:
names = ['Luke', 'Leia', 'Han', 'Ben']
for name in names:
  print(name.upper())

LUKE
LEIA
HAN
BEN
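
* A loop can also build one collection from another. This sketch, where the `lengths` dictionary is a made-up example, records the length of each name

```python
names = ['Luke', 'Leia', 'Han', 'Ben']

lengths = {}                      # start with an empty dictionary
for name in names:
  lengths[name] = len(name)       # key is the name, value is its length

print(lengths)
```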


## Challenge for later: ##
### Collections in Python are very powerful and we could spend days going through all the possibilities
### Some additional topics to discover on your own would be:
* <a href="https://www.geeksforgeeks.org/python-map-function">map</a>
* <a href="https://www.learnpython.org/en/List_Comprehensions">list comprehensions</a>


* Combining all these types of collections is possible and can get pretty tricky fast
* Below we have a list of dictionaries which looks a lot like JSON
* It also is sort of like a table of data, but it's harder to visualize and process

In [23]:
data = [{'team':'Leicester', 'player':'Vardy', 'goals':24}
        ,{'team':'Manchester City', 'player':'Aguero', 'goals':22}
        ,{'team':'Arsenal', 'player':'Sanchez', 'goals':19}]
print(data)

[{'team': 'Leicester', 'player': 'Vardy', 'goals': 24}, {'team': 'Manchester City', 'player': 'Aguero', 'goals': 22}, {'team': 'Arsenal', 'player': 'Sanchez', 'goals': 19}]
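
* To give a feel for the processing involved, here is a rough sketch that totals the goals and finds the top scorer with a plain loop (the names `total_goals` and `top_scorer` are assumptions for illustration)

```python
data = [{'team': 'Leicester', 'player': 'Vardy', 'goals': 24},
        {'team': 'Manchester City', 'player': 'Aguero', 'goals': 22},
        {'team': 'Arsenal', 'player': 'Sanchez', 'goals': 19}]

total_goals = 0
top_scorer = data[0]              # assume the first row is the best until we see a better one
for row in data:
  total_goals += row['goals']
  if row['goals'] > top_scorer['goals']:
    top_scorer = row

print(total_goals)
print(top_scorer['player'], top_scorer['team'])
```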


* So there's a more advanced set of modules that come to the rescue
  * <a href="https://pypi.org/project/numpy/">numpy</a> is a more efficient way to store lists of numbers and offers many built-in math functions
  * <a href="https://pypi.org/project/pandas/">pandas</a> is a package that creates a <font color='blue' face="Courier New" size="+1">dataframe</font> that is similar to a SQL table but in memory

* Functions are saved blocks of code that have a name, take some parameters, and return a value.
* There are many built-in functions.


In [None]:
print(len('abcde'))
print(abs(-10))


* Let's use a simple <font color='blue' face="Courier New" size="+1">if</font> statement to do something different based on what the user enters.

In [None]:
name = input('Enter your name? ')
if len(name) <= 4:
  print(f'{name}, your name is short.')
else:
  print(f'{name}, your name is long.')


* Sometimes we want to repeat a process multiple times, and Python has several ways to do this.
* In general, these are called loops.
* A <font color='blue' face="Courier New" size="+1">while</font> loop is kind of like an if statement: it runs the indented block if the condition is <font color='blue' face="Courier New" size="+1">True</font>, but then it goes back to the top of the <font color='blue' face="Courier New" size="+1">while</font> loop and tests the condition again, repeating until the condition is <font color='blue' face="Courier New" size="+1">False</font>.
* Be careful: we need to do something inside the loop that eventually makes the condition <font color='blue' face="Courier New" size="+1">False</font>, or we end up with an infinite loop.

In [None]:
x = 10
while x > 0:
  print(x)
  x = x - 1
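
* A <font color='blue' face="Courier New" size="+1">while</font> loop is most useful when we don't know in advance how many repeats we need. In this sketch (the starting value of 100 and the names `value` and `steps` are arbitrary), each pass halves the value, which is what eventually makes the condition <font color='blue' face="Courier New" size="+1">False</font>

```python
value = 100
steps = 0
while value > 1:
  value = value / 2     # the value shrinks on every pass, so the loop must end
  steps = steps + 1

print(steps, value)
```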


* Another way to loop is to use a <font color='blue' face="Courier New" size="+1">for</font> statement.
* This will loop through each element in a range or collection (more about that later).


In [None]:
for x in range(10):
  print(x)


* The <font color='blue' face="Courier New" size="+1">range</font> function generates the values from 0 through 9; it goes up to but does not include 10.
* The following code would print the values from 1 through 10.


In [None]:
for x in range(1, 11):
  print(x)

* The <font color='blue' face="Courier New" size="+1">help</font> function can be useful to get details on how functions work.

In [None]:
help(range)

## Try it now: ##
#### Based on what you learned from getting help on range, can you write a loop that counts down from 5 to 1?
<br>
<details><summary>Click for <b>hint</b></summary>
<p>
Use -1 for the step parameter by supplying a third value inside the parentheses.
</p>
</details>

<details><summary>Click for <b>code</b></summary>
<p>

```python
for x in range(5, 0, -1):
  print(x)
```
</p>
</details>
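
* Wrapping a range in <font color='blue' face="Courier New" size="+1">list</font> is a quick way to see exactly which values it produces. A small sketch with start, stop, and step values (the variable names are just for illustration):

```python
evens = list(range(0, 10, 2))        # start at 0, stop before 10, step by 2
countdown = list(range(5, 0, -1))    # a negative step counts down, stopping before 0
empty = list(range(5, 0))            # counting up from 5 never reaches 0, so nothing is produced

print(evens)
print(countdown)
print(empty)
```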

* Sometimes functions are in another module, so to load that module into memory we use the <font color='blue' face="Courier New" size="+1">import</font> command.


In [None]:
import math
print(math.sin(math.radians(90)))   # math.sin expects radians, so convert 90 degrees first

* Some functions are built into the variables directly; we usually call these <font color='blue'>methods</font>.


In [None]:
name = 'Han Solo'
print(name.upper(), name.lower(), name.replace(' ', '-'))

* Let's put it all together and make a complete program.

In [None]:
import random

print('Welcome to our little program.')
name = input('Enter your name? ')
play = True
tries = 0
correct = 0

while play:
  number1 = random.randint(1, 10)
  number2 = random.randint(1, 10)
  correct_answer = number1 + number2
  answer = input(f'What is {number1} + {number2} = ')
  answer = int(answer)
  tries = tries + 1
  if answer == correct_answer:
    correct += 1
    print(f'Good job {name}! You got {correct} out of {tries} right for a score of {100 * correct/tries:.0f}.')
  else:
    print(f'Sorry {name}. The answer is {correct_answer}. You got {correct} out of {tries} right for a score of {100 * correct/tries:.0f}.')

  again = input('Do you want to play again? (Y/N)? ')
  if again.upper() == 'N':
    play = False
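
* One weak spot in the program above is that <font color='blue' face="Courier New" size="+1">int(answer)</font> stops the program with a <font color='blue' face="Courier New" size="+1">ValueError</font> if the user types something that is not a whole number. One possible guard, sketched here with a hypothetical helper named `to_int_or_none` that is not part of the original program, is to catch that error and return <font color='blue' face="Courier New" size="+1">None</font> instead

```python
def to_int_or_none(text):
  # hypothetical helper: convert text to an int, or return None if it is not a whole number
  try:
    return int(text)
  except ValueError:
    return None

print(to_int_or_none('12'))
print(to_int_or_none('twelve'))
```

* The game loop could then check for <font color='blue' face="Courier New" size="+1">None</font> and ask the question again rather than crashing.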



* Sometimes <font color='blue' face="Courier New" size="+1">if</font> conditions can get complicated, so we can combine multiple conditions with <font color='blue' face="Courier New" size="+1">and</font> and <font color='blue' face="Courier New" size="+1">or</font>, and chain alternative checks with <font color='blue' face="Courier New" size="+1">elif</font>


In [None]:
x = 10
y = 20
if x == 10 and y >= 20:
  print('One')
elif x < 10 or y <= 20:
  print('Two')
else:
  print('Three')
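
* Only the first branch whose condition is <font color='blue' face="Courier New" size="+1">True</font> runs, so the order of the checks matters. A small sketch with made-up weather values:

```python
temperature = 25
raining = False

if temperature > 30 or raining:      # True if either condition is True
  advice = 'stay inside'
elif temperature > 20:               # only checked when the first condition was False
  advice = 'go for a walk'
else:                                # runs when none of the conditions above were True
  advice = 'bring a jacket'

print(advice)
```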




* We can write our own functions to save some complex logic into a named block and reuse it.
* We define a function with the <font color='blue' face="Courier New" size="+1">def</font> statement.
* Functions take parameters (input values) that are like variables and allow us to pass values into the function when we call it.

In [None]:
def square(x):
  return x * x

# The following tests the function to see if it worked
# Note that it is not indented, so it's not contained in the body of the function definition
print(square(10))


* Here's a function that takes two parameters.
* Note that both are required when we call the function, and if we leave one out it will give us an error.


In [None]:
def rectangle(x, y):
  return x * y

print(rectangle(10, 20))
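
* To see the error without stopping the cell, this sketch wraps the incomplete call in <font color='blue' face="Courier New" size="+1">try</font> / <font color='blue' face="Courier New" size="+1">except</font> and prints the message Python produces (the variable name `message` is just for illustration)

```python
def rectangle(x, y):
  return x * y

try:
  rectangle(10)                 # y is missing, so Python raises a TypeError
except TypeError as error:
  message = str(error)
  print(message)
```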

* We can make parameters optional by assigning them a default value when we declare the function.
* Then if we don't supply a value, it uses the default.
* If we do supply a value, it overrides the default with what we supply.


In [None]:
def rectangle(x, y = 1):
  return x * y

print(rectangle(10))

In [None]:
print(rectangle(x = 2, y = 3))
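
* When we pass values by name, the order no longer matters, and we can mix positional and named values as long as the positional ones come first. A brief sketch using the same function:

```python
def rectangle(x, y = 1):
  return x * y

print(rectangle(4))              # y falls back to its default of 1
print(rectangle(4, 5))           # both values passed by position
print(rectangle(y = 5, x = 4))   # named values can come in any order
print(rectangle(4, y = 5))       # positional first, then named
```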

## Try it now: ##
#### The code block below defines a function to calculate the volume of a cuboid. Call it three times: passing just an x value, then an x and a y value, and then just an x and a z value.
<br>
<details><summary>Click for <b>hint</b></summary>
<p>
For the first two, you can use the positions to pass the values.
For the third one, though, you would need to use names of the parameters to pass them in.<br>
</p>
</details>

<details><summary>Click for <b>code</b></summary>
<p>

```python
def cuboid(x = 1, y = 2, z = 3):
  return x * y * z

print(cuboid(2))
print(cuboid(2, 3))
print(cuboid(2, z = 4))
```
</p>
</details>

In [None]:
def cuboid(x = 1, y = 2, z = 3):
  return x * y * z



* Lots of functions are already defined; you just have to find them, either in a built-in Python module or in a module that you download and install from the internet or from another team in your organization.

* The <font color='blue' face="Courier New" size="+1">os</font> module is a standard part of Python so we can just import it.
* It has a lot of functions to interact with the operating system.

In [None]:
import os
print(os.getcwd())
os.mkdir('newfolder')
print(os.listdir(os.getcwd()))


* Nobody remembers every function and module, so a lot of the time we either use the Python <font color='blue' face="Courier New" size="+1">help</font> function, look it up on the internet, or use AI to help find what we're looking for.
* Let's try the AI option. Click the <font color='blue'>Generate</font> link in the code cell below and enter the following: <font color='blue'>write python code to check if a folder exists</font>

* This code checks if a folder exists and, if not, creates it in the current working folder.

In [None]:
import os
print(os.getcwd())
if not os.path.exists('newfolder'):
  os.mkdir('newfolder')

print(os.listdir(os.getcwd()))
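
* An alternative worth knowing is <font color='blue' face="Courier New" size="+1">os.makedirs</font> with <font color='blue' face="Courier New" size="+1">exist_ok=True</font>, which creates the folder only when it is missing. This sketch works inside a temporary folder from the standard <font color='blue' face="Courier New" size="+1">tempfile</font> module so it leaves nothing behind; the names `base` and `target` are just for illustration

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
  target = os.path.join(base, 'newfolder')
  os.makedirs(target, exist_ok=True)    # creates the folder
  os.makedirs(target, exist_ok=True)    # already there, so nothing happens and no error is raised
  created = os.path.isdir(target)
  contents = os.listdir(base)

print(created, contents)
```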


* This code goes through every file in a folder but only shows it if it's a CSV file.

In [None]:
for file in os.listdir('sample_data'):
  if file.endswith('.csv'):
    print(file)
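
* The same idea can be wrapped in a function so it works for any folder and extension. This is only a sketch: `files_with_extension` is a hypothetical helper, it lowercases each name so that `.CSV` also matches, and it builds its own temporary folder of empty sample files to run against

```python
import os
import tempfile

def files_with_extension(folder, extension):
  # hypothetical helper: return the sorted file names in folder that end with extension
  matches = []
  for file in os.listdir(folder):
    if file.lower().endswith(extension):
      matches.append(file)
  return sorted(matches)

with tempfile.TemporaryDirectory() as folder:
  # create a few empty sample files to search through
  for file_name in ['a.csv', 'B.CSV', 'notes.txt']:
    with open(os.path.join(folder, file_name), 'w'):
      pass
  found = files_with_extension(folder, '.csv')

print(found)
```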

* When the built-in modules don't have what you need, you can download and install a module from the internet or from another team in your organization.
* <a href="http://pypi.org">PyPI</a> is a website that hosts a lot of open source packages.
* Some are commonly used by everyone, others can be just a personal project someone posted.
* Be careful and make sure you understand what you're downloading.


In [None]:
! pip install converterpro

In [None]:
from converterpro import weight_converter
grams = weight_converter.Pounds(1.0).convert_to_grams()
print(grams)


## Challenge for later: ##
#### Write a program to go through each file in the sample_data folder and copy it to a CSV folder, JSON folder, or other folder based on the file's extension. Make sure to create the folders if they are not already there.
<br>
<details><summary>Click for <b>hint</b></summary>
<p>
You will need to import a library we've not seen yet, so either use the AI feature or look up how to copy a file in Python.
<br>
Use variables to build the path names for the source and destination files.
<br>
Confirm it worked by looking at the directories' content when you're done.
</p>
</details>

<details><summary>Click for <b>code</b></summary>
<p>

```python
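import os
import shutil

source = os.path.join(os.getcwd(), 'sample_data')
folders = ('csv', 'json', 'other')

# create every destination folder up front so the listings at the end never fail
for folder in folders:
  os.makedirs(os.path.join(source, folder), exist_ok=True)

for file in os.listdir(source):
  file_path = os.path.join(source, file)
  if not os.path.isfile(file_path):
    continue                          # skip the destination folders themselves
  if file.endswith('.csv'):
    folder = 'csv'
  elif file.endswith('.json'):
    folder = 'json'
  else:
    folder = 'other'
  shutil.copy(file_path, os.path.join(source, folder))

# confirm it worked by listing what ended up in each folder
for folder in folders:
  print(os.listdir(os.path.join(source, folder)))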
```
</p>
</details>