# Working with Data & Files

## Interfacing with the OS

In [2]:
import os
import pathlib
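
Both modules imported above deal with paths, files and directories. A minimal sketch of how they might be used side by side (the directory and file names are made up for illustration, and everything happens in a temporary directory):

```python
import os
import pathlib
import tempfile

# Work inside a throwaway directory so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    # pathlib: build paths with the / operator
    data_dir = pathlib.Path(tmp) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # hypothetical file name, only for this example
    example_file = data_dir / "example.txt"
    example_file.write_text("hello")

    # os: the older, string-based way of doing similar things
    joined = os.path.join(tmp, "data", "example.txt")
    listing = os.listdir(data_dir)

    same_path = pathlib.Path(joined) == example_file
    file_exists = example_file.exists()
    suffix = example_file.suffix

print(listing, same_path, file_exists, suffix)
```

`pathlib` tends to read more naturally for new code, while `os` still shows up in a lot of existing scripts, so it helps to recognise both.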

## Plain text

[`open` docs](https://docs.python.org/3/library/functions.html#open)
- first arg: file name (path);
- second arg: the mode, `"r"` to read, `"w"` to write (this overwrites an existing file); add a `"b"` for binary (`"rb"`, `"wb"`).
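
To make the modes concrete, here is a small sketch that writes a file in text mode, then reads it back both as text and as binary. The file name is hypothetical and lives in a temporary directory; the explicit `encoding="utf-8"` is an assumption added so the result does not depend on the platform default.

```python
import os
import tempfile

# hypothetical file in a temporary directory, only for this example
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "modes.txt")

# "w": write text (creates the file, or overwrites it if it exists)
with open(path, "w", encoding="utf-8") as o:
    o.write("café")

# "r": read text, you get a str
with open(path, "r", encoding="utf-8") as i:
    as_text = i.read()

# "rb": read binary, you get the raw bytes
with open(path, "rb") as i:
    as_bytes = i.read()

print(type(as_text), len(as_text))    # str, 4 characters
print(type(as_bytes), len(as_bytes))  # bytes, 5 bytes ("é" takes two in UTF-8)
```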

I recommend using the [`with` syntax](https://docs.python.org/3/reference/compound_stmts.html#index-16): it creates a context within which the file is open, then automatically closes it (and cleans things up) when the block ends, so you don't need to think about it.
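
As a rough illustration of what `with` does for you, the sketch below checks that the file is closed after the block, then does roughly the same thing by hand with `try`/`finally`. The file name is hypothetical and sits in a temporary directory.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "with_demo.txt")  # hypothetical file name

# with: the file is closed as soon as the block ends
with open(path, "w") as o:
    o.write("hello")
print(o.closed)

# roughly the same thing, written by hand
i = open(path, "r")
try:
    content = i.read()
finally:
    i.close()  # runs even if read() raised an error
print(i.closed, content)
```

The manual version is easy to get wrong (forgetting `close()`, or skipping it when an error occurs), which is the main argument for `with`.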

In [12]:
with open("data/linux.txt", "r") as i:
    data_read = i.read()

data_read[:200]
# print(data_read)

'intro(1)                 General Commands Manual                 intro(1)\n\nNAME         top\n\n       intro - introduction to user commands\n\nDESCRIPTION         top\n\n       Section 1 of the manual descr'

In [13]:
data_read.split("\n")[:5] # no newlines

['intro(1)                 General Commands Manual                 intro(1)',
 '',
 'NAME         top',
 '',
 '       intro - introduction to user commands']

In [14]:
with open("data/linux.txt", "r") as i:
    data_readlines = i.readlines()

data_readlines[:5] # newlines are kept

['intro(1)                 General Commands Manual                 intro(1)\n',
 '\n',
 'NAME         top\n',
 '\n',
 '       intro - introduction to user commands\n']
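
The difference between the two approaches above is whether the newline characters survive. A small sketch on an in-memory string (made up for this example, with `io.StringIO` standing in for an open file) compares a few ways of getting lines:

```python
import io

# made-up text, standing in for the content of a file
text = "first line\nsecond line\n\nlast line\n"

# split("\n") leaves an empty string at the end when the text ends with a newline
by_split = text.split("\n")

# splitlines() drops the newlines without that trailing empty string
by_splitlines = text.splitlines()

# readlines() on a file-like object keeps the newline on each line
by_readlines = io.StringIO(text).readlines()

# iterating over the file object reads it line by line; strip the newline yourself
by_iteration = [line.rstrip("\n") for line in io.StringIO(text)]

print(by_split)
print(by_splitlines)
print(by_readlines)
print(by_iteration)
```

Iterating over the file object is usually the better choice for large files, since it does not need to hold every line in memory at once.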

## JSON

[JSON (JavaScript Object Notation)](https://www.json.org/json-en.html): the site is a nice reference for the format.  
[Python `json` docs](https://docs.python.org/3/library/json.html)

In [24]:
import json

In [35]:
d = {
    "opening": {
        "totems": ["night", "moon", "fountain"],
        "tools": ["spoon", "megaphone", "pencil sharpener", "plastic skull"]
    },
    "die-throws": [3,5,4,4,6,1,0],
    "eyes-closed": False,
}

# json.dumps converts a Python object to a JSON-formatted string
d_json = json.dumps(d)
d_json

'{"opening": {"totems": ["night", "moon", "fountain"], "tools": ["spoon", "megaphone", "pencil sharpener", "plastic skull"]}, "die-throws": [3, 5, 4, 4, 6, 1, 0], "eyes-closed": false}'

In [None]:
# you can then write it to a file if you wish
with open("data/performance.json", "w") as o:
    o.write(d_json)

In [33]:
# you can achieve the same result with json.dump
with open("data/performance.json", "w") as o:
    json.dump(d, o)

In [34]:
with open("data/performance.json", "r") as i:
    d_reloaded = json.load(i)
d_reloaded

{'opening': {'totems': ['night', 'moon', 'fountain'],
  'tools': ['spoon', 'megaphone', 'pencil sharpener', 'plastic skull']},
 'die-throws': [3, 5, 4, 4, 6, 1, 0],
 'eyes-closed': False}
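
The dictionary above comes back unchanged, but that is not guaranteed for every Python object, because JSON has fewer types than Python. A small sketch with a made-up dictionary (the keys and values are only for illustration) shows a few conversions, plus the `indent` argument for a more readable file:

```python
import json

# made-up data containing things JSON has no direct equivalent for
d = {"throws": (3, 5), 1: "one", "eyes-closed": None}

s = json.dumps(d)
back = json.loads(s)

print(s)     # the tuple is written as an array, the int key as a string, None as null
print(back)  # the tuple comes back as a list, the key 1 as the string "1"

# indent makes the output easier to read (and to diff) at the cost of file size
pretty = json.dumps(d, indent=2)
print(pretty)
```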

## Save binary data

In [1]:
import pickle

In [21]:
l = [10, 5, 8, 7]

with open("data/my_list.pkl", "wb") as o:
    pickle.dump(l, o)

In [19]:
with open("data/my_list.pkl", "rb") as i:
    l_reloaded = pickle.load(i)

l_reloaded

[10, 5, 8, 7]
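
Unlike JSON, pickle is specific to Python and keeps most Python types as they are. A minimal sketch, using `pickle.dumps` / `pickle.loads` to stay in memory instead of writing a file (the dictionary is made up for this example):

```python
import pickle

# made-up data with types that JSON would change or reject
record = {"throws": (3, 5), "seen": {1, 2, 3}, 7: "seven"}

blob = pickle.dumps(record)    # bytes, which is why files are opened with "wb" / "rb"
restored = pickle.loads(blob)

print(type(blob))
print(restored == record)
print(type(restored["throws"]), type(restored["seen"]))
```

One thing to keep in mind: loading a pickle can execute arbitrary code, so only unpickle data that comes from a source you trust.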

## Scraping