- so far, everything we did was stored only in the program's memory
- as soon as the program exits, all of it is lost


In [None]:
names = []

for _ in range(3):
  names.append(input("What's your name? "))

for name in sorted(names):
  print(f"hello, {name}")


What's your name? Ondrej
What's your name? Anezka
What's your name? Silvia
hello, Anezka
hello, Ondrej
hello, Silvia


- All the names that I typed in are lost when the program closes
- I want to save the names to a file somehow
- built-in function __open__


In [None]:
name = input("What's your name? ")

# I want the file to have this name, and with "w" I specify that I'd like to write to it
# if the file does not exist yet, this will create it for me
file = open("names.txt", "w")
file.write(name)
file.close()


What's your name? Hermione


- I input a name
- the name gets stored in the name variable
- the program opens (or even creates) the file
- the program writes the name into the file
- the program closes the file (closing flushes any buffered data to disk, so this is what makes sure the write is saved)
- the problem is that each time I run the program, it replaces the previous name, because "w" empties the file when it opens it
- I need to append values instead
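
To see the difference between the two modes side by side, here is a minimal sketch. It assumes a scratch file in a temporary folder (the name demo_names.txt is made up for this example), so names.txt is not touched.

```python
import os
import tempfile

# hypothetical scratch file, so the sketch does not touch names.txt
path = os.path.join(tempfile.mkdtemp(), "demo_names.txt")

# "w" empties the file every time it is opened, so only the last name survives
for name in ["Hermione", "Harry"]:
  file = open(path, "w")
  file.write(name)
  file.close()

file = open(path)
after_write = file.read()
file.close()

# "a" keeps what is already there and writes at the end
file = open(path, "a")
file.write("Ron")
file.close()

file = open(path)
after_append = file.read()
file.close()

print(after_write)   # Harry
print(after_append)  # HarryRon
```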

In [None]:
name = input("What's your name? ")

# I change w->a, which means "append"
file = open("names.txt", "a")
file.write(name)
file.close()


What's your name? Ron


- I can now see that all the names are in the file, but they run together like HermioneHarryRon
- it would be better to add a new line after each name

In [None]:
name = input("What's your name? ")

file = open("names.txt", "a")
file.write(f"{name}\n")
file.close()


What's your name? Ron


- it's easy to forget to close files
- that can then lead to unsaved data etc.
- we can use the __with__ keyword to do it better


In [None]:
name = input("What's your name? ")

with open("names.txt", "a") as file:
  file.write(f"{name}\n")


What's your name? Hermione


- With the code above, the file is closed automatically as soon as the indented with block ends, so any code further down (outside of the with context) runs with the file already closed
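
A quick way to check this is the closed attribute of the file object. This is only a sketch on a scratch file (demo_names.txt in a temporary folder is a made-up name):

```python
import os
import tempfile

# hypothetical scratch file, so the sketch does not touch names.txt
path = os.path.join(tempfile.mkdtemp(), "demo_names.txt")

with open(path, "a") as file:
  file.write("Hermione\n")
  inside = file.closed   # still open inside the with block

outside = file.closed    # closed automatically once the block ends

print(inside, outside)  # False True
```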

### Reading the files

In [None]:
with open("names.txt", "r") as file: # "r" for read
  lines = file.readlines() # this method reads all the lines of the file into a list, which we then store in a variable

for line in lines:
  #print("hello,",line) # by default there would be a blank line between rows, because print() has end="\n" by default and each line in my txt file already ends with \n
  #print("hello,",line,end="") # possible solution
  print("hello,",line.rstrip()) # we strip the trailing whitespace (including the newline) off the end of the line

hello, Harry
hello, Ron
hello, Hermione


I can do it even more elegantly by looping over the file directly (but then I have a problem with sorting)

In [None]:
# the elegant solution, which has a problem with sorting
with open("names.txt", "r") as file:
  for line in file:
    print("hello,",line.rstrip())


hello, Harry
hello, Ron
hello, Hermione


In [None]:
# sorting it
names = []

with open("names.txt") as file:
  for line in file:
    names.append(line.rstrip())

for name in sorted(names):
  print(f"hello, {name}")

hello, Harry
hello, Hermione
hello, Ron


- This is a very common technique when dealing with files (or information in general): when I want to change the data in some way (e.g. sort it), I first collect it in a variable such as a list and then work with that list
- Now let's simplify it a little bit: (this would not be optimal if I wanted to modify the names)

In [None]:
with open("names.txt") as file:
  for line in sorted(file):
    print("hello,",line.rstrip())

hello, Harry
hello, Hermione
hello, Ron


## Working with CSV files
- so I created a new file students.csv, where every row follows the pattern student,house
- I need to split each row by the delimiter, which is a comma

In [None]:
with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    row = line.rstrip().split(",") # This will always create a list with n values
    print(f"{row[0]} is in {row[1]}")

Hermione is in Gryffindor
Harry is in Gryffindor
Ron is in Gryffindor
Draco is in Slytherin


- we can unpack it into 2 variables immediately like this:

In [None]:
with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    name, house = line.rstrip().split(",")
    print(f"{name} is in {house}")

Hermione is in Gryffindor
Harry is in Gryffindor
Ron is in Gryffindor
Draco is in Slytherin


If I wanted to sort it

In [None]:
students = []

with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    name,house = line.rstrip().split(",")
    students.append(f"{name} is in {house}")

for student in sorted(students):
  print(student)

Draco is in Slytherin
Harry is in Gryffindor
Hermione is in Gryffindor
Ron is in Gryffindor


- However, in the code above, I'm sorting by the whole English sentences .. not optimal
- it'd be better if we could sort by the name only

In [None]:
students = []

with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    name,house = line.rstrip().split(",")
    student = {}
    student["name"] = name
    student["house"] = house
    students.append(student)

print(students) # just to see what it looks like - it's a list of dictionaries

for student in students:
  print(f"{student['name']} is in {student['house']}")

[{'name': 'Hermione', 'house': 'Gryffindor'}, {'name': 'Harry', 'house': 'Gryffindor'}, {'name': 'Ron', 'house': 'Gryffindor'}, {'name': 'Draco', 'house': 'Slytherin'}]
Hermione is in Gryffindor
Harry is in Gryffindor
Ron is in Gryffindor
Draco is in Slytherin


I can simplify the assigning of values:

In [None]:
students = []

with open("students.csv") as file:
  for line in file:
    name,house = line.rstrip().split(",")
    student = {"name": name, "house": house}
    students.append(student)

for student in students:
  print(f"{student['name']} is in {student['house']}")

Hermione is in Gryffindor
Harry is in Gryffindor
Ron is in Gryffindor
Draco is in Slytherin


Further improvement - I need to sort it somehow
- I basically need to tell python to sort the list by looking at a particular key in each dictionary
- e.g. sort the list by looking at the name key in each dictionary

In [2]:
students = []

with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    name,house = line.rstrip().split(",")
    student = {"name": name, "house": house}
    students.append(student)

# I define a function that returns a particular student's name from a dictionary "student"
def get_name(student):
  return student["name"]
# This function returns a single string (one name) each time it is called
# sorted calls it once per dictionary and orders the list by the returned names

# Here I say: use a key that's equal to whatever the return value of get_name is
# python allows you to pass functions as arguments into other functions
# get_name is a function, sorted is a function, and I'm passing get_name to sorted as the value of the "key" parameter
# notice that I use get_name instead of get_name(): I pass the function itself, I don't call it
# sorted will call the key function (get_name in this case) on
#   every dictionary in the list it's supposed to sort
# get_name returns the string that sorted will actually use to decide the order
# it alphabetizes things based on the return value
for student in sorted(students, key = get_name, reverse = True):
  print(f"{student['name']} is in {student['house']}")


# I can create other functions to use for sorting, e.g.
# def get_house(student):
#   return student["house"]


Ron is in Gryffindor
Hermione is in Gryffindor
Harry is in Gryffindor
Draco is in Slytherin


If I'm defining something (variable, function, ..) and then immediately using it, but never again needing its name (like get_name), I can use a __lambda function__
- a lambda function is anonymous
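
As a minimal sketch of the idea, here are a named key function and a lambda side by side. The list is typed in by hand instead of being read from students.csv, so the block runs on its own.

```python
# hand-typed stand-in for the list that is normally built from students.csv
students = [
  {"name": "Hermione", "house": "Gryffindor"},
  {"name": "Harry", "house": "Gryffindor"},
  {"name": "Ron", "house": "Gryffindor"},
  {"name": "Draco", "house": "Slytherin"},
]

def get_name(student):
  return student["name"]

# the same key, once as a named function and once as an anonymous one
by_function = sorted(students, key=get_name)
by_lambda = sorted(students, key=lambda student: student["name"])

print([s["name"] for s in by_lambda])  # ['Draco', 'Harry', 'Hermione', 'Ron']
print(by_function == by_lambda)        # True
```

The lambda must use its own parameter in the body; whatever name comes before the colon is the name to index into after it.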

In [None]:
students = []

with open("drive/MyDrive/Colab Notebooks/Files/students.csv") as file:
  for line in file:
    name,house = line.rstrip().split(",")
    student = {"name": name, "house": house}
    students.append(student)

# The key value (the lambda function) is equivalent to the get_name function in the code above
for student in sorted(students, key = lambda my_student: my_student["name"], reverse = True):
  print(f"{student['name']} is in {student['house']}")
# this says: hey python, there's a function but it has no name, it's anonymous
# that function takes a parameter, here I call it "my_student"
#    sorted calls this function on every one of the students (dicts) in the "students" list
# the body has to use the parameter "my_student", not the loop variable "student"
# (with "student" in the body, every dict would get the same key and nothing would be sorted)
# what do I want the lambda function to return?
#   - given a student, I want to index into the dictionary and access their name
#   - so that the string Hermione, Harry, etc. is ultimately returned, and that's what the sorted function uses to decide how to order the dictionaries


Ron is in Gryffindor
Hermione is in Gryffindor
Harry is in Gryffindor
Draco is in Slytherin


### CSV Library

- Let's add another file, students2.csv, with names and home addresses
- the problem is that the address on the first line contains a comma
- it's wise to use the csv module to work with csv files
  - the module already has tools for dealing with the various problems that come up
  - in this case the __problem__ is that in the csv file, the whole value containing the comma is surrounded by double quotes
  - so I'd need to tell python something like "split around ',' but only the ',' that are not within quotes"
  - this is exactly what the csv module will help me with
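
The difference can be seen on a single line. This is a sketch only: the line is typed in by hand and wrapped in io.StringIO so it behaves like an open file, and it is not the real content of students2.csv.

```python
import csv
import io

# one hand-typed csv line whose second value contains a comma inside quotes
line = 'Harry,"Number Four, Privet Drive"\n'

# plain split cuts at every comma, including the one inside the quotes
naive = line.rstrip().split(",")

# csv.reader understands the quotes and returns two values
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # ['Harry', '"Number Four', ' Privet Drive"']
print(parsed)  # ['Harry', 'Number Four, Privet Drive']
```

With the split version, unpacking into name, home would fail with a ValueError, because there are three values instead of two.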



In [13]:
import csv

students = []

with open("drive/MyDrive/Colab Notebooks/Files/students2.csv") as file:
  reader = csv.reader(file)
  for row in reader:
    print(row)

['Harry', 'Number Four,Privet Drive']
['Ron', 'The Burrow']
['Draco', 'Malfoy Manor']


- the code above prints each row just to see what a row looks like
- now I need to parse the data somehow

In [16]:
# first option
import csv

students = []

with open("drive/MyDrive/Colab Notebooks/Files/students2.csv") as file:
  reader = csv.reader(file)
  for row in reader:
    students.append({"name": row[0], "home": row[1]})

# just for clarity
for row in students:
  print(row)

# the actual output
for student in sorted(students, key = lambda my_student: my_student["name"]):
  print(f"{student['name']} is from {student['home']}")

{'name': 'Harry', 'home': 'Number Four,Privet Drive'}
{'name': 'Ron', 'home': 'The Burrow'}
{'name': 'Draco', 'home': 'Malfoy Manor'}
Draco is from Malfoy Manor
Harry is from Number Four,Privet Drive
Ron is from The Burrow


In [17]:
# second option
import csv

students = []

with open("drive/MyDrive/Colab Notebooks/Files/students2.csv") as file:
  reader = csv.reader(file)
  for name, home in reader:
    students.append({"name": name, "home": home})

# just for clarity
for row in students:
  print(row)

# the actual output
for student in sorted(students, key = lambda my_student: my_student["name"]):
  print(f"{student['name']} is from {student['home']}")

{'name': 'Harry', 'home': 'Number Four,Privet Drive'}
{'name': 'Ron', 'home': 'The Burrow'}
{'name': 'Draco', 'home': 'Malfoy Manor'}
Draco is from Malfoy Manor
Harry is from Number Four,Privet Drive
Ron is from The Burrow


### Opening a csv file with header row
- csv files often have a first row with the column names (a header row)
- instead of csv.reader() I am going to use csv.DictReader(), which iterates through the file top to bottom, loading each line of text not as a list of columns, but as a dictionary keyed by column name
- this gives me automatic access to those column names

In [21]:
# for now just to see how DictReader behaves and what the outcome looks like
import csv

students = []

with open("drive/MyDrive/Colab Notebooks/Files/students3.csv") as file:
  reader = csv.DictReader(file)
  for row in reader:
    print(row)


{'name': 'Harry', 'home': 'Number Four, Privet Drive'}
{'name': 'Ron', 'home': 'The Burrow'}
{'name': 'Draco', 'home': 'Malfoy Manor'}


In [25]:
# now using the DictReader rows to build the list of students and sort it by name
import csv

students = []

with open("drive/MyDrive/Colab Notebooks/Files/students3.csv") as file:
  reader = csv.DictReader(file)
  for row in reader:
    students.append({"name": row["name"], "home": row["home"]})

for student in sorted(students, key=lambda my_student: my_student["name"]):
  print(f"{student['name']} is from {student['home']}")


Draco is from Malfoy Manor
Harry is from Number Four, Privet Drive
Ron is from The Burrow


A huge benefit of using DictReader and reading the column names from the header row is that even if someone changed the order of the columns, the code would still work
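
A minimal sketch of that claim, assuming two hand-typed versions of the same data (not the real students3.csv) and a small helper called load that exists only for this example:

```python
import csv
import io

# the same two students, once as name,home and once with the columns swapped
original = "name,home\nRon,The Burrow\nDraco,Malfoy Manor\n"
swapped = "home,name\nThe Burrow,Ron\nMalfoy Manor,Draco\n"

def load(text):
  # io.StringIO makes the string behave like an open file
  reader = csv.DictReader(io.StringIO(text))
  return [{"name": row["name"], "home": row["home"]} for row in reader]

print(load(swapped)[0])                   # {'name': 'Ron', 'home': 'The Burrow'}
print(load(original) == load(swapped))    # True
```

Because the values are looked up by column name and not by position, both versions give the same list of dictionaries.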

## Writing CSVs
- the following section will not work properly here, because I have not prepared the file with a header row

In [28]:
import csv

name = input("What's your name? ")
home = input("Where's your home? ")

with open("students4.csv", "a") as file:
  writer = csv.writer(file)
  writer.writerow([name,home])

# if someone changes the order of columns in my csv file, this will no longer work
# it would write the name into the home column and the home into the name column

What's your name? A
Where's your home? B


I could use csv.DictWriter to solve that problem
- it will open the file in a similar way
- instead of writing the row as a list [name,home], I pass a dictionary
- in it I specify which value belongs to which column
- now if the order of columns gets mixed up, I only have to update fieldnames and the program will still write values into the right columns
- for this to work I need to specify one more argument in DictWriter: fieldnames
  - I pass a list of the columns that are present in the csv file
  - the values are written in the order of this list, so it has to match the column order of the file
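
Here is a small sketch of how fieldnames decides the order. It writes into an io.StringIO buffer instead of students4.csv, and it also calls writeheader(), which the cells here do not do, to show where the header row would come from.

```python
import csv
import io

# in-memory stand-in for a file (for a real file the csv docs recommend open(..., newline=""))
buffer = io.StringIO()

writer = csv.DictWriter(buffer, fieldnames=["name", "home"])
writer.writeheader()  # writes the column names as the first row

# the keys in the dictionary are in a different order than fieldnames on purpose
writer.writerow({"home": "The Burrow", "name": "Ron"})

rows = buffer.getvalue().splitlines()
print(rows)  # ['name,home', 'Ron,The Burrow']
```

The order inside the dictionary does not matter; the row is written in the order given by fieldnames.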

In [29]:
import csv

name = input("What's your name? ")
home = input("Where's your home? ")

with open("students4.csv", "a") as file:
  writer = csv.DictWriter(file, fieldnames=["name","home"]) # I am using a csv.DictWriter now
  writer.writerow({"name": name, "home": home})



What's your name? Ondrej
Where's your home? Harry


## Image files
- imagine I have 2 image files in my folder
  - costume1.gif
  - costume2.gif

In [None]:
import sys

from PIL import Image

images = []

for arg in sys.argv[1:]: # slicing .. I skip sys.argv[0], which is the program file itself
  image = Image.open(arg)
  images.append(image)

# saving the result to disk (save() comes from the PIL library, which handles opening and closing the output file automatically)
images[0].save(
    "costumes.gif", # name of my final animated gif
    save_all=True,  # save all of the frames that I pass to it (first costume, second, ...)
    append_images=[images[1]], # I'm appending the second image to the first image
    duration=200, # 200 ms for each of the frames
    loop=0 # loop=0 .. it's going to loop an infinite number of times
)

