# Reading and Writing text files

<img src="https://miro.medium.com/max/670/1*wPqqYFfNreXF4INrNhYkeQ.jpeg" width=300>

## Steps: Reading from text files
* Create a file handle
* Read from the file handle
* Close the file handle

## Example Reading

In [1]:
filePath = '../data/teams.csv'
fileHandle = open(filePath, 'r')
# read the whole file into a single string
fileContents = fileHandle.read()
# close the file handle once the contents have been read
fileHandle.close()
print(fileContents)

id,name,arena,city,abbrev,teamName,location,inceptionYear,division,conference
1,New Jersey Devils,Prudential Center,Newark,NJD,Devils,New Jersey,1982,Metropolitan,Eastern
2,New York Islanders,Barclays Center,Brooklyn,NYI,Islanders,New York,1972,Metropolitan,Eastern
3,New York Rangers,Madison Square Garden,New York,NYR,Rangers,New York,1926,Metropolitan,Eastern
4,Philadelphia Flyers,Wells Fargo Center,Philadelphia,PHI,Flyers,Philadelphia,1967,Metropolitan,Eastern
5,Pittsburgh Penguins,PPG Paints Arena,Pittsburgh,PIT,Penguins,Pittsburgh,1967,Metropolitan,Eastern
6,Boston Bruins,TD Garden,Boston,BOS,Bruins,Boston,1924,Atlantic,Eastern
7,Buffalo Sabres,KeyBank Center,Buffalo,BUF,Sabres,Buffalo,1970,Atlantic,Eastern
8,MontrÃ©al Canadiens,Bell Centre,MontrÃ©al,MTL,Canadiens,MontrÃ©al,1909,Atlantic,Eastern
9,Ottawa Senators,Canadian Tire Centre,Ottawa,OTT,Senators,Ottawa,1990,Atlantic,Eastern
10,Toronto Maple Leafs,Scotiabank Arena,Toronto,TOR,Maple Leafs,Toronto,1917,Atlantic,Eastern
12,Caro

## Dealing with Unicode

<img src="https://miro.medium.com/max/800/1*ITKH_QSt8SRJQguCiHAtMQ.png" width=300>

* [slides from pycascades on unicode](https://docs.google.com/presentation/d/17xwPZrnGo5xGUXf_HkxFUTAE2SPisHQd7LcRWyYCL6I/edit#slide=id.p)
* [video from pycascades on unicode](https://www.youtube.com/watch?v=2U9EHYqc59Y)
* Bottom line: Use **UTF-8**

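If you executed the code above, you may have noticed that Montréal Canadiens looked like: **MontrÃ©al Canadiens**. This happens when the UTF-8 bytes in the file are decoded using a different encoding, such as the platform default on Windows.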

## Reading a file with specified encoding

**Note**: We will go over the `import` statement in more detail later.

In [2]:
import codecs
filePath = '../data/teams.csv'
fileHandle = codecs.open(filePath, 'r', 'utf-8')
# read the whole file into a single string
fileContents = fileHandle.read()
# close the file handle once the contents have been read
fileHandle.close()
print(fileContents)

id,name,arena,city,abbrev,teamName,location,inceptionYear,division,conference
1,New Jersey Devils,Prudential Center,Newark,NJD,Devils,New Jersey,1982,Metropolitan,Eastern
2,New York Islanders,Barclays Center,Brooklyn,NYI,Islanders,New York,1972,Metropolitan,Eastern
3,New York Rangers,Madison Square Garden,New York,NYR,Rangers,New York,1926,Metropolitan,Eastern
4,Philadelphia Flyers,Wells Fargo Center,Philadelphia,PHI,Flyers,Philadelphia,1967,Metropolitan,Eastern
5,Pittsburgh Penguins,PPG Paints Arena,Pittsburgh,PIT,Penguins,Pittsburgh,1967,Metropolitan,Eastern
6,Boston Bruins,TD Garden,Boston,BOS,Bruins,Boston,1924,Atlantic,Eastern
7,Buffalo Sabres,KeyBank Center,Buffalo,BUF,Sabres,Buffalo,1970,Atlantic,Eastern
8,Montréal Canadiens,Bell Centre,Montréal,MTL,Canadiens,Montréal,1909,Atlantic,Eastern
9,Ottawa Senators,Canadian Tire Centre,Ottawa,OTT,Senators,Ottawa,1990,Atlantic,Eastern
10,Toronto Maple Leafs,Scotiabank Arena,Toronto,TOR,Maple Leafs,Toronto,1917,Atlantic,Eastern
12,Carolin
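
In Python 3 the built-in `open()` also accepts an `encoding` argument, so the same result should be achievable without `codecs`. The sketch below is a minimal illustration only: it writes a small placeholder file (`sample_teams.csv`, not the course's `teams.csv`) into a temporary directory so that it can run anywhere, then reads it back as UTF-8.

```python
import os
import tempfile

# Placeholder sample data; the real course file is ../data/teams.csv
sampleText = "id,name\n8,Montréal Canadiens\n"

# write the sample to a temporary file, explicitly encoded as UTF-8
tmpDir = tempfile.mkdtemp()
samplePath = os.path.join(tmpDir, 'sample_teams.csv')
fileHandle = open(samplePath, 'w', encoding='utf-8')
fileHandle.write(sampleText)
fileHandle.close()

# read it back, telling open() which encoding to decode with
fileHandle = open(samplePath, 'r', encoding='utf-8')
fileContents = fileHandle.read()
fileHandle.close()
print(fileContents)
```

Passing the encoding explicitly on both the write and the read avoids depending on whatever default encoding the platform happens to use.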

## Writing Files

<img src="https://thornleyfallis.com/wp-content/uploads/2016/08/fountain-pen-image.jpg" width=300>

* in the following example we will write to a file

In [None]:
# writing a line to a text file
fh = open('../data/junk.txt', 'w')
# notice the \n character... that's a newline (line feed), which ends the line.
fh.write("writing this line to the text file\n")
fh.close()

## Using the `with` statement

<img src="https://contenthub-static.grammarly.com/blog/wp-content/uploads/2017/02/BEAR-WITH-ME-.jpg" width="500">

*"The with statement simplifies exception handling by encapsulating common
preparation and cleanup tasks."*

* puts the file operation in a code block
* automatically calls `close()` for you

## Example:

In [None]:
with open('../data/junk.txt', 'w') as fh:
    fh.write("writing this line to the text file\n")
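
The `with` statement works the same way for reading. As a rough sketch (the temporary directory and the file contents are assumptions made so the example is self-contained), the following writes two lines, reads them back one line at a time, and then checks that the file was closed automatically:

```python
import os
import tempfile

# Hypothetical file location in a temporary directory (not part of the course data)
tmpDir = tempfile.mkdtemp()
junkPath = os.path.join(tmpDir, 'junk.txt')

# write two lines; the file is closed automatically when the block ends
with open(junkPath, 'w', encoding='utf-8') as fh:
    fh.write("first line\n")
    fh.write("second line\n")

# read the file back one line at a time
lines = []
with open(junkPath, 'r', encoding='utf-8') as fh:
    for line in fh:
        # strip the trailing newline character from each line
        lines.append(line.rstrip('\n'))

print(lines)
print(f"file closed after the with block: {fh.closed}")
```

Iterating over the file handle reads one line per loop pass, which avoids loading a large file into memory all at once.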

## Exercise:
* create a function that receives a file path
* the function opens the file and...
* iterates over every line in the file...
* identifies lines in the file that represent teams from the 'Central' division...
* creates a new list for teams in the central division...
* after all the central teams have been collected they are sorted...
* returns the sorted list
* call the function


In [None]:
# add code to open the teams file (../data/teams.csv)
# iterate over each line
# split each line into list
# create a new list with the teams in the  'Central' division
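
One possible shape for a solution is sketched below. It is not the only approach, and several things in it are assumptions: the function name `getCentralTeams`, the choice to look columns up by the header names `name` and `division`, and the placeholder rows written to a temporary file so the sketch runs without the course data.

```python
import os
import tempfile


def getCentralTeams(filePath):
    """Return a sorted list of team names in the 'Central' division."""
    centralTeams = []
    with open(filePath, 'r', encoding='utf-8') as fh:
        # the first line holds the column names
        header = fh.readline().rstrip('\n').split(',')
        nameIdx = header.index('name')
        divisionIdx = header.index('division')
        for line in fh:
            fields = line.rstrip('\n').split(',')
            # skip blank or truncated lines that lack the needed columns
            if len(fields) <= max(nameIdx, divisionIdx):
                continue
            if fields[divisionIdx] == 'Central':
                centralTeams.append(fields[nameIdx])
    centralTeams.sort()
    return centralTeams


# Placeholder rows for demonstration only, NOT real data from teams.csv
sampleRows = (
    "id,name,division\n"
    "1,Team B,Central\n"
    "2,Team C,Atlantic\n"
    "3,Team A,Central\n"
)
samplePath = os.path.join(tempfile.mkdtemp(), 'sample_teams.csv')
with open(samplePath, 'w', encoding='utf-8') as fh:
    fh.write(sampleRows)

print(getCentralTeams(samplePath))
```

Splitting on `,` is enough here because none of the fields contain commas; data with embedded commas would need the `csv` module instead.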


# File Paths in Python Scripts

<img src="https://lh3.googleusercontent.com/bMoj2UYYUrnU-ulBae1CFAnnieD0GC1SIeEZg4KpPFOgO-bC-8C5VRlpxuf69uWYo26T0XGt9IkY1tVIsniq72GxXHfldqOtjZAKQdirIwvu3tmCQxqsi1JsskiigfTMVWV3hEywvgOw_jDuiOI_tefmKqqjBbGBwIozUt8l1gPWCjUif0593wzse_1QHZmsSYjcxhxACwi0q58tEglTcEF2AxgxOPauYn_WZP__JnriStSD-xaQw6uIXa0ZIXAHbLyPCle5jSBAuPjjFxS8gNjGyVbFxcAhhOpOgU30kv86kqoJI1Wbdqd-EAw2SwtxZvXH9KRV-QBtyJJAoA2VUvUX1oRqJtZ-eOpj3iHgwRYExDYkYUZSQwkj65eDZWD2I3689fGI0vz0m07bSGK8PoW7NoUjJ4_YZBP2LwE1EnxuYqXD5hxE-Bzo0Y2EOmR3mjOyPFsohX7H3wMT7T42PHhy666zLDSB6CDIqb6QNjaYexk0HlPgEFgIC5QYZwN55LXIbtNrE0eRygm-rvJpV-6EBxUK_c0AF3s1IuNeaYKoDwQqK_S368ZKt63Jrdr10LikXkC_dhJjY9EpJpd6sE8ZTg2VYEIwwH5ga380zuXHbahxbdBtvv26Ij80ZOSvv2oiY5QyUqpZMgG7hh1p7AKEhegyaSmMFM_aMbR2tby5ALi0orfqHlzgZPIq1whnPpWzWwFeuU2rXHpXmjtJJgfBVG9r-2zSwa2KZHsOgJwOUVzQ=w1267-h713-no" width="500">

* frequently you want to construct file paths programmatically.
* add / join to path
* get directory
* get just file name from path

**DO NOT** do this using string methods.

**INSTEAD** use the python module [os.path](https://docs.python.org/3/library/os.path.html#module-os.path)

## Commonly used methods:
* [basename()](https://docs.python.org/3/library/os.path.html#os.path.basename) - gets just the file name (the last component of the path)
* [join()](https://docs.python.org/3/library/os.path.html#os.path.join) - joins two or more path components together
* [dirname()](https://docs.python.org/3/library/os.path.html#os.path.dirname) - gets the directory portion of the path (everything before the last component)
* [exists()](https://docs.python.org/3/library/os.path.html#os.path.exists) - checks whether the path exists
* [split()](https://docs.python.org/3/library/os.path.html#os.path.split) - returns two parts: the directory portion and the last component

# Exercise / Demo:



In [None]:
import os.path
startPath = r'C:\Kevin\proj\python_training'

# add a directory to the path:
path2Lesson1Dir = os.path.join(startPath, '01_ToolOrientation')
print(f"path2Lesson1Dir {path2Lesson1Dir}")

# does the directory exist
if os.path.exists(path2Lesson1Dir):
    print(f'path2Lesson1Dir: {path2Lesson1Dir} exists')
else:
    print(f'path2Lesson1Dir: {path2Lesson1Dir} DOES NOT exist')

# Create this path to a file
# C:\Kevin\proj\python_training\01_ToolOrientation\hello.py
helloPyPath = os.path.join(startPath, '01_ToolOrientation','hello.py')
print(f"helloPyPath: {helloPyPath}")

# get just hello.py from the path
justHelloPy = os.path.basename(helloPyPath)
print(f"just the hello.py file: {justHelloPy}")

# get just the directory from the full path to hello.py
helloDir = os.path.dirname(helloPyPath)
print(f'the enclosing directory for hello.py file is: {helloDir}')

# doing the same thing in one line
helloDir, justHelloPy = os.path.split(helloPyPath)
print(f"hello dir: {helloDir}")
print(f"hello file: {justHelloPy}")


## Pathlib (newer alternative to os.path)

<img src="https://upload.wikimedia.org/wikipedia/en/thumb/3/36/Alttentacleslogo.jpg/180px-Alttentacleslogo.jpg" width=150>

* seems to be the way things are headed...
* more complex to deal with than os.path
* [Docs for pathlib](https://docs.python.org/3/library/pathlib.html)
* [a Zealot advocates for pathlib](https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/)
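
For comparison, here is a rough pathlib sketch of the same path operations shown in the `os.path` demo. It uses `PureWindowsPath`, which only manipulates the path text and never touches the disk, so the Windows-style path should behave the same on any operating system. The concrete `Path` class adds filesystem methods such as `exists()`.

```python
from pathlib import PureWindowsPath

# PureWindowsPath only manipulates path text, so this runs on any OS
startPath = PureWindowsPath(r'C:\Kevin\proj\python_training')

# the / operator joins path parts (like os.path.join)
helloPyPath = startPath / '01_ToolOrientation' / 'hello.py'
print(f"helloPyPath: {helloPyPath}")

# .name is like os.path.basename
print(f"file name: {helloPyPath.name}")

# .parent is like os.path.dirname
print(f"enclosing directory: {helloPyPath.parent}")

# .suffix gives the file extension
print(f"extension: {helloPyPath.suffix}")
```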

# File Management (os and shutil)

* `os.path` allows you to programmatically work with paths.
* we will now look at how you can manage files using python

# [os module](https://docs.python.org/3/library/os.html)

<img src="https://i.pinimg.com/originals/7c/69/c0/7c69c00c727280d39018699c9345c329.jpg" width=200>

The description from the official docs:

*This module provides a portable way of using operating system dependent functionality. If you just want to read or write a file see [open()](https://docs.python.org/3/library/functions.html#open), if you want to manipulate paths, see the [os.path](https://docs.python.org/3/library/os.path.html#module-os.path) module, and if you want to read all the lines in all the files on the command line see the fileinput module. For creating temporary files and directories see the [tempfile](https://docs.python.org/3/library/tempfile.html#module-tempfile) module, and for high-level file and directory handling see the [shutil](https://docs.python.org/3/library/shutil.html#module-shutil) module.*

## Useful os module functionality

What follows is a very small subset of the functionality in the os module

* [os.mkdir](https://docs.python.org/3/library/os.html#os.mkdir) - create a single directory; the parent directory must already exist
* [os.makedirs](https://docs.python.org/3/library/os.html#os.makedirs) - create the full path, including any missing intermediate directories
* [os.remove](https://docs.python.org/3/library/os.html#os.remove) - delete a file
* [os.rmdir](https://docs.python.org/3/library/os.html#os.rmdir) - remove a single, empty directory
* [os.environ](https://docs.python.org/3/library/os.html#os.environ) - access to environment variables
* [os.walk](https://docs.python.org/3/library/os.html#os.walk) - recursively navigate a filesystem

### Exercise / Demo

### mkdir / makedirs / remove / rmdir

In [1]:
import os
# create a directory path
newDir = os.path.join('.', 'junkdir')
newDir = os.path.abspath(newDir)
print(f'full path to dir to create: {newDir}')
if not os.path.exists(newDir):
    os.mkdir(newDir)
    print(f"created single dir: {newDir}")

# add a bunch more paths to newDir
newDirSubPaths = os.path.join(newDir, 'somedir', 'anotherdir', 'andAnother')
if not os.path.exists(newDirSubPaths):
    os.makedirs(newDirSubPaths)
    print(f"created a bunch of sub dirs..: {newDirSubPaths}")

# create a file containing a single line
subDirWithFile = os.path.join(newDirSubPaths, 'somefile.txt')
with open(subDirWithFile, 'w') as fh:
    fh.write("add one line to the file\n")
    print(f'created the file: {subDirWithFile}')

# use os.remove to delete the file
if os.path.exists(subDirWithFile):
    os.remove(subDirWithFile)
    print(f"delete the file: {subDirWithFile}")

# delete the dir andAnother
if os.path.exists(newDirSubPaths):
    os.rmdir(newDirSubPaths)
    print(f"deleted the directory: {newDirSubPaths}")

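# try to delete all the dirs:
if os.path.exists(newDir):
    # ERROR! os.rmdir only removes empty directories, and newDir still
    # contains somedir, so this call raises OSError
    os.rmdir(newDir)
    print(f"deleted the dir and subdirs {newDir}")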

full path to dir to create: c:\Kevin\proj\python_training\04_Files\junkdir
created single dir: c:\Kevin\proj\python_training\04_Files\junkdir
created a bunch of sub dirs..: c:\Kevin\proj\python_training\04_Files\junkdir\somedir\anotherdir\andAnother
created the file: c:\Kevin\proj\python_training\04_Files\junkdir\somedir\anotherdir\andAnother\somefile.txt
delete the file: c:\Kevin\proj\python_training\04_Files\junkdir\somedir\anotherdir\andAnother\somefile.txt
deleted the directory: c:\Kevin\proj\python_training\04_Files\junkdir\somedir\anotherdir\andAnother


OSError: [WinError 145] The directory is not empty: 'c:\\Kevin\\proj\\python_training\\04_Files\\junkdir'

### Walk a file system

#### Example:

In [2]:
import os
for root, dirs, files in os.walk("."):
    if 'venv' not in root and '.git' not in root:
        print(f'rootdir: {root}, has {len(files)} and {len(dirs)} sub directories')


rootdir: ., has 4 and 1 sub directories
rootdir: .\junkdir, has 0 and 1 sub directories
rootdir: .\junkdir\somedir, has 0 and 1 sub directories
rootdir: .\junkdir\somedir\anotherdir, has 0 and 0 sub directories
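
Testing `'venv' not in root` skips the printing, but `os.walk` still descends into those directories. With the default top-down traversal, the `dirs` list can be edited in place to prune them instead. A minimal sketch, assuming a throwaway directory tree whose names are placeholders:

```python
import os
import tempfile

# build a small throwaway tree (directory names here are placeholders)
topDir = tempfile.mkdtemp()
os.makedirs(os.path.join(topDir, 'venv', 'lib'))
os.makedirs(os.path.join(topDir, 'src', 'pkg'))

skipDirs = {'venv', '.git'}
visited = []
for root, dirs, files in os.walk(topDir):
    # editing dirs in place stops os.walk from descending into those dirs
    dirs[:] = [d for d in dirs if d not in skipDirs]
    visited.append(os.path.relpath(root, topDir))
    print(f'rootdir: {root}, has {len(files)} files and {len(dirs)} sub directories')
```

The slice assignment `dirs[:] = ...` matters: rebinding `dirs` to a new list would leave the list that `os.walk` is using unchanged.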


### os.environ

* a common security practice is to put secrets into environment variables and have code get them from there.
* VS Code can be set up to populate environment variables automatically.

In [3]:
import os
# print environment variables
cnt = 0
for envName in os.environ:
    # filter to env names that start with v to reduce output, only print first 4
    if envName[0].upper() == 'V' and cnt < 4:
        print(f"env var: {envName} = {os.environ[envName]}")
        cnt += 1

env var: VBOX_MSI_INSTALL_PATH = C:\Program Files\Oracle\VirtualBox\
env var: VERBOSE_LOGGING = true
env var: VIRTUAL_ENV = C:\Kevin\proj\python_training\venv
env var: VSCODE_CLI = 1
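
When code reads a secret from the environment, the variable may not be set. A small sketch of handling that case follows. `MY_APP_SECRET` and `MY_APP_LOG_LEVEL` are hypothetical variable names invented for this illustration.

```python
import os

# MY_APP_SECRET is a hypothetical variable name used only for this sketch
secret = os.environ.get('MY_APP_SECRET')
if secret is None:
    print("MY_APP_SECRET is not set")
else:
    # avoid printing the secret itself
    print(f"MY_APP_SECRET is set ({len(secret)} characters)")

# a default can be supplied for optional settings
logLevel = os.environ.get('MY_APP_LOG_LEVEL', 'INFO')
print(f"log level: {logLevel}")
```

Indexing with `os.environ['MY_APP_SECRET']` raises `KeyError` when the variable is missing, while `.get()` returns `None` or the supplied default.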


# shutil - High level os functions

<img src="https://lh3.googleusercontent.com/it-6sJvplgxuKN5YpKaSWCn58N1u44fsBsyE9qPWd4lg6ti6hkV7EcthBfcG-5uwSv3W3CjToFa4CCa-EA_OW8jrNE3CP-BI3tMejLGjcRy4g866KOMmdsQNkPs8uJzV19ra9BVqFZQr5HY4lL5EQsWOs4up66j0POo6T7SFyt65Pwh4-fJlgo67Vvk2XOKjukJKwGZ1DrjkIY8IBhlm3npEJWrMP4Y6ShWocpM1equDwmG0Zxo5ScBb5UezG2e1MF9OM0VHm4v-WSUB0LlKhFsaB8VHboZtRJAs_MttwoyxUumrnZFxuL8PFugeEcp1rWVLCiAbWtRUeNViF65BRK0Q1ypVaDPGTttDfG6kjgJQ9eGdOdIyQFxZW3oY142emfRRWXhvOaAAChLSGOk7mMKVDaIl01kIIaCB8oVMDEyvvhF22BL6vvmLtIupQX5kTo-nhxZD_BJN9RxUUc8oZpPXw_zVDrJmjLB0OeZEJp76aHxISe6QB23bjp1EleHKObVw0kPT9n3Qvrr1g1cr47zt1TI-DIxhAQr2w5EtDO0OPT6H_D_M0lcsMaxSZrkj1Gmcm-DzjyApCX-b1cb1lZDY6z3rsoeuf6sxSKtSZoA9CDrkgL6HBhf_8UyMMF2Ioyor4V9yTpL-DWJxJm0V5aKUo-Ydv8jU_dSjpikVNpLFgxq5_ovpWiGIcNJH2da0I8LMDnW8QcNX6OoXuQ0Qs-R0495_kw_cHKIVSiJjj7t2I_gG=w1106-h829-no" width=300>

As the name would suggest, shutil gives you the type of functionality you would expect to get in a shell.

* [shutil docs](https://docs.python.org/3/library/shutil.html#module-shutil)

## Commonly used shutil functions:
* [copytree()](https://docs.python.org/3/library/shutil.html#shutil.copytree) - recursively copy a directory tree
* [rmtree()](https://docs.python.org/3/library/shutil.html#shutil.rmtree) - delete a directory and everything in it
* [move()](https://docs.python.org/3/library/shutil.html#shutil.move) - move a file or directory to another location

**WARNING:** *you can very quickly delete a lot of potentially useful data with this module... use with caution... Probably print before you delete.*

## shutil examples / demo:

In [None]:
import os
import shutil

# defining the directory path and making sure it exists.
newDir = os.path.join('.', 'junkdir')
newDir = os.path.abspath(newDir)
newDirSubPaths = os.path.join(newDir, 'somedir', 'anotherdir', 'andAnother')
if not os.path.exists(newDirSubPaths):
    os.makedirs(newDirSubPaths)
    print(f"created a bunch of sub dirs..: {newDirSubPaths}")

# the magic of shutil to delete the entire dir
print(f'newDir is {newDir}')
if os.path.exists(newDir):
    shutil.rmtree(newDir)
    print(f"just deleted the whole tree and everything in it: {newDir}")
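
The other two functions listed, `copytree()` and `move()`, can be sketched in the same spirit. Everything here happens inside a temporary directory, and all directory and file names are placeholders chosen for the illustration.

```python
import os
import shutil
import tempfile

# work inside a throwaway directory; all names below are placeholders
workDir = tempfile.mkdtemp()
srcDir = os.path.join(workDir, 'original')
os.makedirs(os.path.join(srcDir, 'subdir'))
with open(os.path.join(srcDir, 'subdir', 'note.txt'), 'w') as fh:
    fh.write("some text\n")

# copy the whole tree; by default the destination must not exist yet
copyDir = os.path.join(workDir, 'copy')
shutil.copytree(srcDir, copyDir)
copyExisted = os.path.exists(os.path.join(copyDir, 'subdir', 'note.txt'))
print(f"copied {srcDir} to {copyDir}: {copyExisted}")

# move (rename) the copy to a new location
movedDir = os.path.join(workDir, 'moved')
shutil.move(copyDir, movedDir)
movedOk = (os.path.exists(os.path.join(movedDir, 'subdir', 'note.txt'))
           and not os.path.exists(copyDir))
print(f"moved {copyDir} to {movedDir}: {movedOk}")

# clean up everything that was created
shutil.rmtree(workDir)
```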

# Reading / Writing CSV files

<img src="https://percipient.co/wp-content/uploads/2015/02/csv-icon.png" width=150 style="background-color:white;padding:5px;">

* Writing CSV files can be tricky...
* fields may contain `,` characters that you are expecting to use as a delimiter
* You may want to quote text fields but not number fields
* the builtin csv module can help with these issues.
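
As a brief illustration of the points above, the sketch below writes and re-reads a field that contains a comma. The rows are placeholder data and the file lives in a temporary directory. `csv.QUOTE_NONNUMERIC` is one of the module's quoting options: on writing, it quotes text fields and leaves numbers unquoted.

```python
import csv
import os
import tempfile

# placeholder rows; the arena name contains a comma on purpose
rows = [
    ['id', 'name', 'arena'],
    [1, 'Team A', 'Big Arena, Downtown'],
    [2, 'Team B', 'Small Arena'],
]

csvPath = os.path.join(tempfile.mkdtemp(), 'sample.csv')

# newline='' is what the csv docs recommend when opening files for csv
with open(csvPath, 'w', newline='', encoding='utf-8') as fh:
    # QUOTE_NONNUMERIC quotes text fields but leaves numbers unquoted
    writer = csv.writer(fh, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerows(rows)

# the reader understands the quotes, so the embedded comma survives
with open(csvPath, 'r', newline='', encoding='utf-8') as fh:
    readRows = list(csv.reader(fh))

print(readRows[1])
```

With its default settings the reader returns every field as a string, so the `id` values come back as `'1'` and `'2'` rather than integers.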

## Docs on CSV module.
* [CSV module documentation](https://docs.python.org/3/library/csv.html)
* [decent tutorial on getting started with CSV module](https://pymotw.com/2/csv/)

**Note:** *not covering in this course but you should be aware of it*
