<img src="https://www.python.org/static/community_logos/python-powered-w-200x80.png" style="float: left; margin: 20px; height: 55px">

# Python Basics - File System

_Author: Alfred Zou_

---

## Opening Files
* Most of the time, our data will live in a separate file
* We need to tell Python to open the file, then extract the data
* We've already covered opening files in the classes notebook on context managers, but here I want to expand on the functionality of `open()`
* There are several mode flags that can be used:
    * `r`: read (the default)
    * `b`: binary. Combined with another flag, e.g. `rb` or `wb`
    * `w`: write. Overwrites any existing content. Creates the file if it doesn't already exist
    * `a`: append. Writes to the end of the file, without adding a newline for you. Creates the file if it doesn't already exist
* Other arguments:
    * `encoding`: the encoding used to read or write the file, e.g. `'utf-8'`

```python
with open(my_file_path,flags) as f:
    f.my_method()
```
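As a minimal sketch of how the flags and `encoding` fit together, the example below writes, appends, and then reads the same file back in text and binary mode. The file name `modes_demo.txt` and the temporary directory are assumptions made for illustration, so that no real file gets overwritten.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'modes_demo.txt')

# 'w' creates the file, or empties it first if it already exists
with open(path, 'w', encoding='utf-8') as f:
    f.write('café\n')

# 'a' keeps the existing content and writes at the end of the file
with open(path, 'a', encoding='utf-8') as f:
    f.write('second line\n')

# 'r' returns text (str), decoded using the given encoding
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

# 'rb' returns the raw bytes, with no decoding applied
with open(path, 'rb') as f:
    raw = f.read()

print(repr(text))
print(type(raw))
```

Passing `encoding` explicitly is a design choice: without it, Python falls back to a platform-dependent default, which can make the same script behave differently on another machine.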

##### Writing to a file
* Use `f.write()` when the file is opened with `w` or `a`. It does not add a newline, so include `\n` yourself

In [68]:
with open('Scripts/test.txt','w') as f:
    f.write('line 1\n')
    f.write('line 2\n')
    f.write('line 3\n')

##### Retrieving values from a file
* `f.readlines()` to retrieve all the lines of a file in a list format

In [69]:
with open('Scripts/test.txt','r') as f:
    rows = f.readlines()
rows

['line 1\n', 'line 2\n', 'line 3\n']

##### Separating Column Headings from Data
* With a CSV, you will often want to separate the header row from the data rows

In [89]:
with open('Data/au-500.csv','r') as f:
    header = next(f)
    data = f.readlines()[:5]
print(f'header: {header}')
print(f'data: {data}')

header: "first_name","last_name","company_name","address","city","state","post","phone1","phone2","email","web"

data: ['"Rebbecca","Didio","Brandt, Jonathan F Esq","171 E 24th St","Leith","TAS",7315,"03-8174-9123","0458-665-290","rebbecca.didio@didio.com.au","http://www.brandtjonathanfesq.com.au"\n', '"Stevie","Hallo","Landrum Temporary Services","22222 Acoma St","Proston","QLD",4613,"07-9997-3366","0497-622-620","stevie.hallo@hotmail.com","http://www.landrumtemporaryservices.com.au"\n', '"Mariko","Stayer","Inabinet, Macre Esq","534 Schoenborn St #51","Hamel","WA",6215,"08-5558-9019","0427-885-282","mariko_stayer@hotmail.com","http://www.inabinetmacreesq.com.au"\n', '"Gerardo","Woodka","Morris Downing & Sherred","69206 Jackson Ave","Talmalmo","NSW",2640,"02-6044-4682","0443-795-912","gerardo_woodka@hotmail.com","http://www.morrisdowningsherred.com.au"\n', '"Mayra","Bena","Buelt, David L Esq","808 Glen Cove Ave","Lane Cove","NSW",1595,"02-1455-6085","0453-666-885","mayra.bena@gmail.com
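The rows above are still raw strings, quotes and commas included. One possible way to get each row as a list of values is the standard library's `csv` module. This is a sketch, not the notebook's original approach: the three-column sample below is a made-up excerpt in the same shape as `au-500.csv`, and `io.StringIO` stands in for the open file.

```python
import csv
import io

# Hypothetical excerpt shaped like au-500.csv (only three columns kept)
sample = io.StringIO(
    '"first_name","last_name","company_name"\n'
    '"Rebbecca","Didio","Brandt, Jonathan F Esq"\n'
    '"Stevie","Hallo","Landrum Temporary Services"\n'
)

reader = csv.reader(sample)
header = next(reader)   # first row: the column names
data = list(reader)     # remaining rows, each one a list of strings

print(header)
print(data[0])
```

Note that the comma inside `"Brandt, Jonathan F Esq"` stays within a single value, which a plain `split(',')` would not handle. With a real file, the `csv` documentation recommends opening it with `newline=''`.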

##### Reading a file
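* `next(f)` returns the next line. The file object returned by `open()` is an iterator, so it yields rows one at a time, much like a generator
* To read a whole file, you can loop over the file object directly. This is my preferred approach, as it never loads the entire file into memory at once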

In [73]:
with open('Scripts/test.txt','r') as f:
    print(next(f),end='')
    print(next(f),end='')
    print(next(f),end='')

line 1
line 2
line 3


In [85]:
with open('Scripts/test.txt','r') as f:
    for row in f:
        print(row,end='')

line 1
line 2
line 3
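Because the file object is an iterator, `next()` and a `for` loop share the same position in the file. The sketch below illustrates this, using `io.StringIO` as a stand-in for an open text file (an assumption made to keep the example self-contained).

```python
import io

# io.StringIO behaves like an open text file for the purposes of this sketch
f = io.StringIO('line 1\nline 2\nline 3\n')

first = next(f)                          # consumes 'line 1\n'
rest = [row.rstrip('\n') for row in f]   # the loop continues from line 2
after_end = next(f, None)                # exhausted: returns the default instead of raising StopIteration

print(repr(first))
print(rest)
print(after_end)
```

Stripping the trailing `\n` with `rstrip('\n')` is an alternative to passing `end=''` to `print()`.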


##### Appending to a file

In [86]:
with open('Scripts/test.txt','a') as f:
    f.write('line 4\n')

In [87]:
# To confirm appending of data
with open('Scripts/test.txt','r') as f:
    rows = f.read()
rows

'line 1\nline 2\nline 3\nline 4\n'

## Navigating the File System using Python
* Sometimes it is useful to interact with our computer's file system
* This is probably easier to do in Bash (refer to the notebook on the command line interface), as Python's approach through the `os` module is more verbose
* However, Bash is quite limited as a programming language
* If you need to use Python and interact with your computer's file system, it's probably easier to use only Python rather than attempt to integrate Python and Bash

In [1]:
import os

##### Getting the current working directory (cwd)

In [2]:
os.getcwd()

'C:\\Users\\draciel\\Dropbox\\General_Assembly\\Github\\Notes'

##### Listing the cwd

In [3]:
print(os.listdir())

['.git', '.gitignore', '.ipynb_checkpoints', '0.1 Intro to Data & Command Line.ipynb', '0.2 Git & GitHub.ipynb', '0.3 Nano & Vim.ipynb', '0.5 Python & Jupyter.ipynb', '1.0 Python Basics - Data Types.ipynb', '1.1 Python Basics - Control Flow.ipynb', '1.2 Python Basics - Functions.ipynb', '1.3 Python Basics - Classes.ipynb', '1.4 Python Basics - File System.ipynb', '1.5 Python Basics - Scripting.ipynb', '1.9 Python Basics - Misc.ipynb', '5.0 Data Visualisation.ipynb', '6.0 NumPy.ipynb', '7.0 Pandas.ipynb', '8.0 Regular Expressions.ipynb', '9.0 Relational Databases and SQL.ipynb', 'chromedriver', 'code-vault', 'Data', 'demo.txt', 'github test.ipynb', 'IgnoredDirectory', 'README.md', 'Scripts', 'Web Scraping & APIs.ipynb', '__pycache__']


##### Changing directory (relative path)

In [4]:
os.chdir('Scripts')
print(os.getcwd())
print(os.listdir())

C:\Users\draciel\Dropbox\General_Assembly\Github\Notes\Scripts
['math_operations.py', 'Test Folder', 'test.py', 'test.txt', '__pycache__']


##### Creating folders

In [7]:
os.mkdir('Test Folder')
os.listdir()

['math_operations.py', 'Test Folder', 'test.py', 'test.txt', '__pycache__']
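`os.mkdir()` raises `FileExistsError` if the folder is already there, which is easy to hit when re-running a notebook cell. A minimal sketch of that behaviour, and of `os.makedirs(..., exist_ok=True)` as one way around it, is below. It works inside a temporary directory (an assumption for illustration) rather than the `Scripts` folder.

```python
import os
import tempfile

# Work inside a throwaway directory so the example has no side effects elsewhere
base = tempfile.mkdtemp()
folder = os.path.join(base, 'Test Folder')

os.mkdir(folder)             # first call creates the folder

try:
    os.mkdir(folder)         # second call fails because the folder exists
    raised = False
except FileExistsError:
    raised = True

# exist_ok=True makes the call a no-op when the folder is already present
os.makedirs(folder, exist_ok=True)

print(raised)
print(os.path.isdir(folder))
```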

##### Changing directory (absolute path)

In [8]:
existing_path = os.getcwd()
# os.path.join uses the correct separator for the operating system
new_path = os.path.join(existing_path, 'Test Folder')
print(new_path)

C:\Users\draciel\Dropbox\General_Assembly\Github\Notes\Scripts\Test Folder


In [9]:
os.chdir(new_path)
os.getcwd()

'C:\\Users\\draciel\\Dropbox\\General_Assembly\\Github\\Notes\\Scripts\\Test Folder'

##### Creating Files

In [15]:
with open('test_file.csv','w') as f:
    pass
print(os.listdir())

['test_file.csv']


##### Removing Files

In [16]:
os.remove('test_file.csv')
print(os.listdir())

[]


##### Removing folders

In [18]:
os.chdir('..')
os.rmdir('Test Folder')
os.listdir()

['math_operations.py', 'test.py', 'test.txt', '__pycache__']
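The create and remove steps above can also be run without changing directory, by building full paths with `os.path.join`. The following is a sketch under the assumption that a temporary directory is an acceptable stand-in for `Scripts`; the folder and file names mirror the ones used above.

```python
import os
import tempfile

base = tempfile.mkdtemp()                         # throwaway parent directory
folder = os.path.join(base, 'Test Folder')
file_path = os.path.join(folder, 'test_file.csv')

os.mkdir(folder)                                  # create the folder
with open(file_path, 'w') as f:                   # create an empty file
    pass
created = os.listdir(folder)

os.remove(file_path)                              # remove the file first
os.rmdir(folder)                                  # os.rmdir only removes empty folders
still_there = os.path.exists(folder)

os.rmdir(base)                                    # clean up the parent directory

print(created)
print(still_there)
```

The order matters: `os.rmdir()` raises an `OSError` if the folder still contains files, so the file has to be removed before the folder.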