# Python 101 @ SzISz III.

---

In [None]:
from szisz import *
BASE = '../data/'

---

## Deus ex Python

You've downloaded a series complete with subtitles but the video and  
subtitle filenames don't match! Write a function which renames the  
mismatching subtitles!

hint(s) - useful functions:

- `download(_name, _season, _episodes, _mismatch)`
- `string.lower()` (built-in)
- `find_episode_number(filename)`
- `rename_subtitle(original, new, target_dir)`
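
The helpers listed above come from the `szisz` module and are not shown here, so the following is only a sketch of the matching logic with hypothetical stand-ins. `episode_of` and `plan_renames` are made-up names, the regular expression assumes names like `S01E02` or `1x02`, and nothing is renamed on disk.

```python
import os
import re

# Hypothetical pattern: matches 's01e02' or '1x02' style episode markers.
EPISODE_PATTERN = re.compile(r'(?:s\d+e|\d+x)(\d+)')


def episode_of(filename):
    """Return the episode number found in the name, or None."""
    match = EPISODE_PATTERN.search(filename.lower())
    return int(match.group(1)) if match else None


def plan_renames(videos, subtitles):
    """Return (old_subtitle, new_subtitle) pairs without touching any file."""
    video_by_episode = {}
    for video in videos:
        episode = episode_of(video)
        if episode is not None:
            video_by_episode[episode] = video
    plan = []
    for subtitle in subtitles:
        video = video_by_episode.get(episode_of(subtitle))
        if video is None:
            continue  # no video with the same episode number
        # borrow the video's base name, keep the subtitle's extension
        new_name = os.path.splitext(video)[0] + os.path.splitext(subtitle)[1]
        if new_name != subtitle:
            plan.append((subtitle, new_name))
    return plan


videos = ['Matrix.S01E01.mkv', 'Matrix.S01E02.mkv']
subtitles = ['matrix_1x02_eng.srt', 'Matrix.S01E01.srt']
plan = plan_renames(videos, subtitles)
print(plan)
```

Building the plan first and renaming afterwards makes it easy to inspect what would change before any file is touched.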


---

## File I/O

Reading from a file is really easy:

In [None]:
# first we need a filename
filename = BASE + 'text.txt'

In [None]:
# and a mode
mode = 'r' # r stands for reading
# and we have to open the file for reading
my_file = open(filename, mode) 
# we can iterate over its lines directly:
for line in my_file:
    print line

In [None]:
my_file.seek(0, 0) # help(file.seek)
# or read every line into a list:
lines_as_list = my_file.readlines()
print lines_as_list

In [None]:
my_file.seek(0, 0)
# or read the whole file as string:
lines_as_string = my_file.read()
print lines_as_string

In [None]:
# we can do it either way... BUT!
# DO NOT FORGET TO CLOSE IT once you finished working with it!
my_file.close()
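
To compare the three ways of reading side by side, here is a small sketch. It uses a throwaway temporary file (an assumption made so that `text.txt` stays untouched) and calls `print()` with a single argument so it behaves the same on Python 2 and 3. Note that every line keeps its trailing newline, which is why printing the lines one by one leaves blank lines in between.

```python
import os
import tempfile

# create a throwaway file with two lines
handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)
sample = open(path, 'w')
sample.write('red pill\nblue pill\n')
sample.close()

sample = open(path, 'r')
by_iteration = [line for line in sample]   # iterate over the file object
sample.seek(0, 0)                          # jump back to the beginning
by_readlines = sample.readlines()          # list of lines
sample.seek(0, 0)
as_string = sample.read()                  # one big string
sample.close()
os.remove(path)

# strip the trailing newline before printing or comparing
stripped = [line.rstrip('\n') for line in by_iteration]
print(stripped)
```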

Pretty easy, huh? What about writing into a file?

In [None]:
mode = 'w' # as you can guess, w stands for writing ;)
my_file = open(filename, mode) 
# we can write into it directly:
my_file.write('You take the red pill, you stay in Wonderland, '
              'and I show you how deep the rabbit hole goes...')
# again, don't forget to close the file
my_file.close()

There is more! Do you find it cumbersome to open and close the file?  
Good news: you do not have to worry about it at all!  

In [None]:
mode = 'r' 
with open(filename, mode) as my_file: 
    for line in my_file.readlines(): 
        print line 
# aaaaand it's closed ;)
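
A quick way to convince yourself that `with` really closes the file is to check the `closed` attribute afterwards. The sketch below uses a temporary file (an assumption, so nothing in `BASE` is modified) and also shows that the file gets closed when an exception interrupts the block.

```python
import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)

with open(path, 'w') as my_file:
    my_file.write('follow the white rabbit\n')
# the block is over, so the file is already closed
closed_after_block = my_file.closed

try:
    with open(path, 'r') as my_file:
        raise ValueError('something went wrong while reading')
except ValueError:
    pass
# the file is closed even though the block ended with an error
closed_after_error = my_file.closed

os.remove(path)
print(closed_after_block)
print(closed_after_error)
```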

Can we add content to existing files?

In [None]:
# Yes, we can!
mode = 'a' # a stands for append
with open(filename, mode) as my_file:
    my_file.write('Remember, all I\'m offering is the truth, nothing more...')
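
The difference between `'w'` and `'a'` is easy to miss: `'w'` empties an existing file before writing, while `'a'` keeps the old content and writes at the end. A minimal sketch on a temporary file (an assumption, so `text.txt` is left alone):

```python
import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)

with open(path, 'w') as my_file:
    my_file.write('first\n')
with open(path, 'w') as my_file:
    my_file.write('second\n')   # 'w' throws away 'first'
with open(path, 'a') as my_file:
    my_file.write('third\n')    # 'a' keeps 'second' and adds to it

with open(path, 'r') as my_file:
    content = my_file.read()
os.remove(path)
print(content)
```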

---

## CSV files

But... we want to read in some CSV files. Do we really need to go through  
all the hassle with the commas, quotation marks and all that bs?

In [None]:
# ofc not! someone already wrote that for us!
import csv
filename = BASE + 'text.csv'

Read it!

In [None]:
mode = 'r'
with open(filename, mode) as my_file:
    # we have to create a csv reader in order to read
    # and we have to specify the delimiter and the quote character
    # or the dialect.
    my_csv = csv.reader(my_file, delimiter=';', quotechar='"')
    # we can read out the rows easily from the file
    for row in my_csv:
        # you get each row as a list
        print row

Write it!

In [None]:
mode = 'w'
with open(filename, mode) as my_file:
    # we'll need a writer
    # the arguments are the same as before
    my_csv = csv.writer(my_file, delimiter=';', quotechar='"')
    # we need some data to save:
    data = [['Smith', 'Smith', 'Smith', 'Smith'],
            ['Smith', 'Smith', 'Smith', 'Smith']]
    # then write each row into the file,
    # one-by-one
    for row in data:
        my_csv.writerow(row)
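
Why bother with the quote character? The sketch below writes fields that contain the delimiter and a quote, then reads them back unchanged. The file is a temporary one and the rows are made-up sample data. It follows the plain `'w'`/`'r'` modes used above; the `csv` documentation recommends binary modes (`'wb'`/`'rb'`) on Python 2 and `newline=''` on Python 3, which matters mostly on Windows.

```python
import csv
import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.csv')
os.close(handle)

# made-up rows: one field contains the delimiter, another contains quotes
rows = [['name', 'quote'],
        ['Smith', 'Mr. Anderson; welcome back'],
        ['Neo', 'He said "whoa"']]

with open(path, 'w') as my_file:
    writer = csv.writer(my_file, delimiter=';', quotechar='"')
    writer.writerows(rows)      # writes every row in one call

with open(path, 'r') as my_file:
    raw_text = my_file.read()   # what actually landed in the file

with open(path, 'r') as my_file:
    reader = csv.reader(my_file, delimiter=';', quotechar='"')
    read_back = list(reader)

os.remove(path)
print(raw_text)
print(read_back)
```

The writer wraps the tricky fields in quotes on its own, so the reader can split the line at the right places.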

---

## Unicode madness

Writing in exotic languages can cause problems, and we need to handle them.  
Originally we had only 128 characters (the ASCII table) to work with.

In [None]:
print_image('http://www.asciitable.com/index/asciifull.gif', 'net')

But then the problem was addressed with the Unicode character set.  
It currently contains more than 100k characters, including the kanji  
characters (Klingon and Elvish, sadly, have no official code points).  
Long story short, we should use the UTF-8 character encoding when  
working with text files.

In [None]:
# We need a built-in python module
import codecs

In [None]:
filename = BASE + 'unicodetext.txt'
mode = 'r'
encoding = 'utf-8'
# and use its functions to work with files:
with codecs.open(filename, mode, encoding) as my_unicode_file:
    # each line already ends with a newline, so join without a separator
    content = u''.join(my_unicode_file.readlines())
print content
print repr(content)
print type(content)

In [None]:
mode = 'w'
with codecs.open(filename, mode, encoding) as my_unicode_file:
    my_unicode_file.write(u'Árvíztűrő tükörfúrógép')
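
As an alternative to `codecs.open`, the built-in `io.open` also takes an `encoding` argument and behaves the same on Python 2 and 3. The sketch below is only an illustration on a temporary file (an assumption, so `unicodetext.txt` is not touched): it writes the text, reads it back decoded, and then reads the raw bytes to show how it is stored on disk.

```python
import io
import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.txt')
os.close(handle)

text = u'Árvíztűrő tükörfúrógép'

with io.open(path, 'w', encoding='utf-8') as my_unicode_file:
    my_unicode_file.write(text)

with io.open(path, 'r', encoding='utf-8') as my_unicode_file:
    decoded = my_unicode_file.read()    # unicode text again

with io.open(path, 'rb') as my_raw_file:
    raw_bytes = my_raw_file.read()      # the bytes stored on disk

os.remove(path)
print(len(decoded))     # number of characters
print(len(raw_bytes))   # number of bytes, larger because of the accents
```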

In [None]:
# Encode a unicode string into a UTF-8 byte string (despite the variable name, this is not ASCII)
ascii_content = content.encode('utf-8')
print ascii_content
print repr(ascii_content)
print type(ascii_content)

In [None]:
# Decode the UTF-8 byte string back into a unicode string
unicode_content = ascii_content.decode('utf-8')
print unicode_content
print repr(unicode_content)
print type(unicode_content)
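
To see why plain ASCII is not enough here, try encoding the same text with the `'ascii'` codec. This is a small self-contained sketch; the variable names are made up for the illustration.

```python
text = u'Árvíztűrő tükörfúrógép'

# UTF-8 can represent every character, using 2 bytes for each accented one
utf8_bytes = text.encode('utf-8')

# ASCII cannot, so a strict encode raises an error
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# 'replace' swaps every unencodable character for '?', losing information
lossy = text.encode('ascii', 'replace')

# decoding the UTF-8 bytes gives back the original text
round_trip = utf8_bytes.decode('utf-8')

print(ascii_failed)
print(repr(lossy))
print(round_trip == text)
```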

---

## Let's see how deep the rabbit hole goes!

Write our fake "download" function

---

Merge the matching rows.  

- Read the data from the "matching.csv"
- Add the numerical values together in the rows with matching ID values
- Concatenate the string values and separate them with  `' & '`
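
One possible way to approach the merging step is sketched below. The layout of `matching.csv` is not shown here, so the sketch works on made-up in-memory rows and assumes that the ID is the first column and that every cell is a string, as `csv.reader` would return it. `is_number` and `merge_rows` are hypothetical helper names.

```python
from collections import OrderedDict


def is_number(value):
    """Return True if the string can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def merge_rows(rows):
    """Merge rows sharing the same ID (assumed to be the first column)."""
    merged = OrderedDict()  # remembers the order in which IDs first appear
    for row in rows:
        row_id, values = row[0], row[1:]
        if row_id not in merged:
            merged[row_id] = list(values)
            continue
        current = merged[row_id]
        for index, value in enumerate(values):
            if is_number(current[index]) and is_number(value):
                # numerical values are added together
                current[index] = str(float(current[index]) + float(value))
            else:
                # string values are concatenated
                current[index] = current[index] + ' & ' + value
    return [[row_id] + values for row_id, values in merged.items()]


# made-up sample rows: ID, name, amount
rows = [['1', 'Neo', '10'],
        ['2', 'Trinity', '5'],
        ['1', 'Anderson', '2.5']]
merged_rows = merge_rows(rows)
print(merged_rows)
```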