In [1]:
%autosave 1

Autosaving every 1 seconds


# Equivalent vs. identical

In the context of Python:

- If two variables refer to the same object, they are identical
    - `a is b` would be `True`
- If two objects have the same value, they are equivalent
    - `a == b` would be `True`

The `is` operator checks whether two objects are identical. It is ***not*** the same as the comparison operator `==`, which checks whether they are equivalent.

## Equivalent mutable objects are not necessarily identical

- Identical-ness implies equivalent-ness
    - If `a is b` is true, `a == b` is true as well (a rare exception is `float('nan')`, which is not equal to itself)
- Equivalent-ness does not imply identical-ness
    - If `a == b` is true, we cannot be certain that `a is b` is true

In [2]:
# These two lists are equivalent (having same elements), but not identical

a = [1, 2, 3]
b = [1, 2, 3]

a is b

False

In [3]:
a == b

True

In [4]:
# Since they are not identical, changing b does not affect a

b.append(99)
a, b

([1, 2, 3], [1, 2, 3, 99])

## Identical mutable objects -- be careful of this behavior of Python!

In [5]:
a = [1, 2, 3]
b = a

In [6]:
a == b

True

In [7]:
a is b

True

Watch this behavior:

In [8]:
b.append(99)
a, b

([1, 2, 3, 99], [1, 2, 3, 99])

This behavior is useful when you do not need separate copies of an object, since no data is duplicated. But if you are not careful, a change made through one name shows up unexpectedly under the other.

To avoid the sharing, use the `.copy()` method to create a new object with the same contents:

In [9]:
a = [1,2,3]
b = a.copy()     # explicit copy

`a` and `b` are now two independent objects. Changing either one does not affect the other.

In [10]:
a == b

True

In [11]:
a is b

False

In [12]:
b.append(99)
b

[1, 2, 3, 99]

`a` is unaffected:

In [13]:
a

[1, 2, 3]
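
One caveat is worth a quick sketch: `.copy()` makes a shallow copy, so nested mutable objects are still shared between the original and the copy. The example below uses the standard-library function `copy.deepcopy` to show the difference; the variable names are only for illustration.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested.copy()          # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0].append(99)             # mutate an inner list through the original

print(shallow[0])   # [1, 2, 99]  (the inner list is shared)
print(deep[0])      # [1, 2]      (fully independent)
print(shallow[0] is nested[0], deep[0] is nested[0])   # True False
```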

## Immutable objects

### numbers

In [14]:
a = 3
b = 3
a is b

True

In [15]:
a == b

True

### strings

In [16]:
a = 'banana'
b = 'banana'
a is b

True

In [17]:
a == b

True

### tuples

In [18]:
a = (3,5,7)
b = (3,5,7)
a == b

True

In [19]:
a is b

False

### examples

In [20]:
a = 'apple'
b = a
a = 'orange'
b

'apple'

In [21]:
a = (3,5,7)
b = a
a = (1,2,3)
b

(3, 5, 7)

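Whether two equal immutable objects are also identical depends on implementation details of the Python interpreter (for example, caching of small integers and short strings).

Advice: if this is confusing to you, try not to write a program that depends on these behaviors! Compare values with `==`.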

# List - revisit

One of the most important Python object types

- A list is mutable
- Most (not all) list methods modify the list in place and return `None`

What's wrong with this statement:

In [22]:
a = [1,2,3,4]
a = a.append(5)
# a = ?

What's wrong with this function?

In [23]:
def myfunc1(a):
    a = a.append(-99)
    return a

a = [1,2,3,4]
b = myfunc1(a)
# b = ?

## Learn how to read the documentation

i.e. get familiar with the language and style used in Python documentation

In [24]:
a = []
help(a.append)

Help on built-in function append:

append(...) method of builtins.list instance
    L.append(object) -> None -- append object to end
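
The `-> None` in the help text answers both questions above: `append` changes the list in place and returns `None`, so assigning its result throws the list away. A minimal sketch of two ways to write the function correctly (the names `append_in_place` and `append_to_copy` are made up for this illustration):

```python
def append_in_place(a):
    """Modify the caller's list and return it."""
    a.append(-99)        # do not assign the result; append returns None
    return a

def append_to_copy(a):
    """Leave the caller's list alone and return a new list."""
    return a + [-99]

a = [1, 2, 3, 4]
b = append_in_place(a)
print(b, b is a)         # [1, 2, 3, 4, -99] True

c = [1, 2, 3, 4]
d = append_to_copy(c)
print(c, d)              # [1, 2, 3, 4] [1, 2, 3, 4, -99]
```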



## Deleting elements in a list

There are different ways:

1. del
2. pop(index)
3. remove(element)

In [25]:
a = ['a', 'b', 'c', 'd']
x = a.pop(2)

The list `a` becomes:

In [26]:
a

['a', 'b', 'd']

`x` captures the element that was popped:

In [27]:
x

'c'

By default (without specifying the index), `.pop()` removes the last element from a list:

In [28]:
a.pop()
a

['a', 'b']

Check the documentation of a function: `help()`

In [29]:
help(a.pop)

Help on built-in function pop:

pop(...) method of builtins.list instance
    L.pop([index]) -> item -- remove and return item at index (default last).
    Raises IndexError if list is empty or index is out of range.



In [30]:
# add an element
a.append('x')
a

['a', 'b', 'x']

In [31]:
del a[1:3]
a

['a']

In [32]:
a.remove('a')
a

[]

In [33]:
help(a.remove)

Help on built-in function remove:

remove(...) method of builtins.list instance
    L.remove(value) -> None -- remove first occurrence of value.
    Raises ValueError if the value is not present.



## Utility functions for lists

- len(list)
- max(list)
- min(list)
- sum(list)   # for numbers

In [34]:
a = [1,2,3,4,5,6,7,8,9]
sum(a), max(a), min(a), len(a)

(45, 9, 1, 9)

In [35]:
avg = sum(a) / len(a)
print(avg)

5.0


## Some examples

Given `t = [1,2,3]`, to append an element, use either

- `t.append(99)`
- `t = t + [99]`

In [36]:
t = [1,2,3]
t = t + [99]   # or t.append(99)
t

[1, 2, 3, 99]

These are wrong:

```
t.append([99])     # valid syntax, but probably not what you want
t = t.append(99)   # destroys "t"! (t becomes None)
t + [99]           # does not append: the new list is discarded
t = t + 99         # wrong: 99 is not a list (TypeError)
```
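
To see what some of these lines actually do, here is a small sketch. It also uses `list.extend`, which is not mentioned above but is the standard list method for appending every element of another list in place.

```python
t = [1, 2, 3]
t.append([99])          # appends the list itself as a single element
print(t)                # [1, 2, 3, [99]]

t = [1, 2, 3]
t.extend([99])          # appends each element of the argument
print(t)                # [1, 2, 3, 99]

t = [1, 2, 3]
result = t.append(99)   # append returns None
print(result)           # None

t = [1, 2, 3]
try:
    t = t + 99          # list + int is not defined
except TypeError as err:
    print("TypeError:", err)
```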

## Sorting a list

There are two ways:

- `.sort()` -- a member function that sorts the list in place and returns `None`
- `sorted()` -- a built-in function that creates a new sorted list

### `.sort()`

A member function of a list object.

In [37]:
a = [5, -9, 3.145, 3, -18]

In [38]:
b = a.sort()

In [39]:
a

[-18, -9, 3, 3.145, 5]

In [40]:
b

Why is `b` showing nothing?

In [41]:
help(a.sort)

Help on built-in function sort:

sort(...) method of builtins.list instance
    L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*



Use the `reverse` option to sort in descending order:

In [42]:
a.sort(reverse=True)
a

[5, 3.145, 3, -9, -18]

### sorted()

Return a new sorted list.

In [43]:
a = [5, -9, 3.145, 3, -18]
b = sorted(a)

In [44]:
a

[5, -9, 3.145, 3, -18]

In [45]:
b

[-18, -9, 3, 3.145, 5]

Note that `sorted()` can be applied to other containers as well. For example, applied to a dictionary it returns a sorted list of the keys:

In [46]:
d = {1: 'a', 3: 'c', 2: 'b', 5: 'e'}
d

{1: 'a', 2: 'b', 3: 'c', 5: 'e'}

In [47]:
d2 = sorted(d)
d2

[1, 2, 3, 5]

In [48]:
d3 = sorted(d, reverse=True)
d3

[5, 3, 2, 1]
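
`sorted()` and `.sort()` also accept the `key` argument that appears in the help text above. As a sketch, the following sorts the same dictionary by value instead of by key, and sorts a list of strings by length; the sample data is only for illustration.

```python
d = {1: 'a', 3: 'c', 2: 'b', 5: 'e'}

# d.items() yields (key, value) pairs; sort them by the value, largest first
by_value = sorted(d.items(), key=lambda kv: kv[1], reverse=True)
print(by_value)      # [(5, 'e'), (3, 'c'), (2, 'b'), (1, 'a')]

words = ['banana', 'fig', 'apple']
by_length = sorted(words, key=len)
print(by_length)     # ['fig', 'apple', 'banana']
```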

# String - revisit

- Strings are immutable
- String methods never modify the string; most (not all) of them return a new string.

## strings and lists

Some functions that convert strings to or from lists

### join elements of a list into a string

In [49]:

a = ['Today', 'Is', 'A', 'Good', 'Day']
x = ' '.join(a)    # note the space character ' '
x

'Today Is A Good Day'

In [50]:
''.join(a)

'TodayIsAGoodDay'

In [51]:
a = ['Today', 'Is', 'A', 'Good', 'Day']
y = '---'.join(a)
y

'Today---Is---A---Good---Day'

### split a string into a list

In [52]:
a = 'Today is a good day'
a = a.split()                    # by default, any run of whitespace is the delimiter
a

['Today', 'is', 'a', 'good', 'day']

By default, `.split()` splits on any run of whitespace (spaces, tabs, newlines). One can also specify the delimiter string explicitly.

In [53]:
a = 'John  Mary,Jane,LLoyd'
a.split(',')

['John  Mary', 'Jane', 'LLoyd']

In [54]:
b = 'Today--is-a--good-day'
b = b.split('--')                 # specify '--' to be the delimiter
b

['Today', 'is-a', 'good-day']

In [55]:
help(''.split)

Help on built-in function split:

split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings
    
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.
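
The difference between the default and an explicit `' '` separator, as described in the help text, shows up when a string contains consecutive spaces. A short sketch, which also shows `maxsplit`:

```python
s = 'John  Mary'                  # two spaces between the names

default_split = s.split()         # whitespace runs collapse, no empty strings
space_split = s.split(' ')        # every single space is a delimiter
print(default_split)              # ['John', 'Mary']
print(space_split)                # ['John', '', 'Mary']

# maxsplit limits the number of splits, counted from the left
limited = 'a,b,c,d'.split(',', 1)
print(limited)                    # ['a', 'b,c,d']
```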



# Database

- A database is a file organized for storing data
- Query  (read)
- Update (write)
- Follows a standard format
- Data have types

Examples: MySQL, PostgreSQL, SQLite, etc.

The `sqlite3` module is part of the Python standard library, so we will use SQLite to show examples. Using other database systems from Python is similar.

To use SQLite in Python,
```
    import sqlite3
```

## SQL basics

This is not a formal introduction to SQL; you should take a database class or read a book if you want to know more about it.

Some working knowledge:

- database table
- rows and columns
- each column has a data type
- relational databases

Basic SQL commands:
- select: query
- insert: add a row in a table
- create: create a new table
- drop: delete a table


Commonly used SQL commands:

`CREATE TABLE table_name (column_name1 type1, column_name2 type2, ...)`

`INSERT INTO table_name (column_name1, column_name2, ...) VALUES (v1,v2,...)`

`SELECT * FROM table_name WHERE <condition>`

`UPDATE table_name SET column_name=XXX WHERE <condition>`

`DELETE FROM table_name WHERE <condition>`
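
As a rough sketch, the commands above can be tried out with an in-memory SQLite database (`':memory:'`), which leaves no file behind. The table name `pets` and its columns are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')     # temporary database held in RAM
cur = conn.cursor()

cur.execute('CREATE TABLE pets (name TEXT, age INTEGER)')
cur.execute('INSERT INTO pets (name, age) VALUES (?, ?)', ('Rex', 3))
cur.execute('INSERT INTO pets (name, age) VALUES (?, ?)', ('Tom', 5))

cur.execute('UPDATE pets SET age = 4 WHERE name = ?', ('Rex',))
cur.execute('DELETE FROM pets WHERE age > ?', (4,))
conn.commit()

rows = cur.execute('SELECT * FROM pets').fetchall()
print(rows)          # [('Rex', 4)]
conn.close()
```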


## Creating a database

In [56]:
import sqlite3

conn = sqlite3.connect('mydb.sqlite')
cur = conn.cursor()            # similar to a file handle, i.e. f = open(...)

cur.execute('DROP TABLE IF EXISTS mytable')                      # SQL command
cur.execute('CREATE TABLE mytable (name TEXT, score INTEGER)')   # SQL command

# insert some records

cur.execute('INSERT INTO mytable (name, score) VALUES (?,?)', ('John', 60))
cur.execute('INSERT INTO mytable (name, score) VALUES (?,?)', ('Jane', 70))
cur.execute('INSERT INTO mytable (name, score) VALUES (?,?)', ('Jim', 90))

conn.commit()       # don't forget to commit!
conn.close()

Batch processing - prepare the data in a list of tuples, then call `executemany`:

In [57]:
import sqlite3

conn = sqlite3.connect('mydb.sqlite')
cur = conn.cursor()            # similar to a file handle, i.e. f = open(...)

cur.execute('DROP TABLE IF EXISTS mytable')                      # SQL command
cur.execute('CREATE TABLE mytable (name TEXT, score INTEGER)')   # SQL command

d = [('John', 66), ('Jane', 77), ('Jim', 99)]

cur.executemany('INSERT INTO mytable (name, score) VALUES (?,?)', d)

conn.commit()       # don't forget to commit!
conn.close()

In [58]:
# check db
! sqlite3 mydb.sqlite "select * from mytable"

John|66
Jane|77
Jim|99


## query a table

In [59]:
import sqlite3

conn = sqlite3.connect('mydb.sqlite')
cur = conn.cursor()

cur.execute('SELECT * from mytable')
for row in cur:
    print(row)

('John', 66)
('Jane', 77)
('Jim', 99)


In [60]:
cur.execute('SELECT * from mytable where score > 80')
for row in cur:
    print(row)

('Jim', 99)


In [61]:
# SQL string literals are written with single quotes
cur.execute("SELECT * from mytable where name = 'John'")
for row in cur:
    print(row)

('John', 66)
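
When the value in a `WHERE` clause comes from a variable, the `?` placeholder used for the inserts earlier works for queries too, and it avoids quoting problems. A minimal sketch using an in-memory database so that it runs on its own; the helper name `find_by_name` and the extra sample row are hypothetical.

```python
import sqlite3

def find_by_name(conn, name):
    """Return all rows of mytable whose name column equals `name`."""
    cur = conn.execute('SELECT * FROM mytable WHERE name = ?', (name,))
    return cur.fetchall()

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE mytable (name TEXT, score INTEGER)')
conn.executemany('INSERT INTO mytable (name, score) VALUES (?,?)',
                 [('John', 66), ('Jane', 77), ("O'Neil", 88)])

r1 = find_by_name(conn, 'John')
r2 = find_by_name(conn, "O'Neil")      # the embedded quote needs no escaping
conn.close()

print(r1)      # [('John', 66)]
print(r2)      # [("O'Neil", 88)]
```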


## Update a row

In [62]:
# update the row where name = "John"

cur.execute("UPDATE mytable SET score=99 where name = 'John'")
conn.commit()

cur.execute("SELECT * from mytable where name = 'John'")
for row in cur:
    print(row)

('John', 99)


## Delete a row

In [63]:
# delete the "John" row

cur.execute("DELETE FROM mytable WHERE name = 'John'")
conn.commit()

cur.execute('SELECT * from mytable')
for row in cur:
    print(row)

('Jane', 77)
('Jim', 99)


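# SQLite -- accessing by column names

The following examples use a different database file, `data/mydb.sqlite`, whose `mytable` has additional columns such as `phone`.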


In [64]:
import sqlite3

conn = sqlite3.connect('data/mydb.sqlite')

conn.row_factory = sqlite3.Row        # enable access by column names

cur = conn.execute('SELECT * from mytable')
for row in cur:
    print('{:<10s} {:>10s}'.format(row['name'], row['phone']))

conn.close()

John          123-456
Jane          246-000
Jim           333-333


## Capture the query results in a list -- `fetchall()`

In [65]:
import sqlite3

conn = sqlite3.connect('data/mydb.sqlite')
cur = conn.execute('SELECT * from mytable')

x = cur.fetchall()

conn.close()
print(x)

[('John', '101 Westwood, Los Angeles', '123-456'), ('Jane', '95 Hollywood, Los Angeles', '246-000'), ('Jim', '88 Pico Blvd, Los Angeles', '333-333')]


In [66]:
for t in x:
    print('{:<15}  {}'.format(t[0],t[1]))

John             101 Westwood, Los Angeles
Jane             95 Hollywood, Los Angeles
Jim              88 Pico Blvd, Los Angeles


## A word about cursor buffer

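A cursor hands out each row only once. After `fetchall()` (or after iterating over the cursor), the result set is exhausted and a second fetch returns an empty list.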

In [67]:
import sqlite3

conn = sqlite3.connect('data/mydb.sqlite')
cur = conn.execute('SELECT * from mytable')
x = cur.fetchall()

print("after fetchall, x =", x)

# At this point, the cursor's buffer has been emptied (by the "fetch")
# fetch again

x2 = cur.fetchall()
print('fetch again, x2 =',x2)

conn.close()

after fetchall, x = [('John', '101 Westwood, Los Angeles', '123-456'), ('Jane', '95 Hollywood, Los Angeles', '246-000'), ('Jim', '88 Pico Blvd, Los Angeles', '333-333')]
fetch again, x2 = []
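
Besides `fetchall()`, a cursor also provides `fetchone()` and `fetchmany(n)`, which consume the result set piece by piece. A sketch with an in-memory database (the sample rows are made up):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE mytable (name TEXT)')
conn.executemany('INSERT INTO mytable (name) VALUES (?)',
                 [('John',), ('Jane',), ('Jim',)])

cur = conn.execute('SELECT name FROM mytable ORDER BY name')
first = cur.fetchone()        # one row, or None when nothing is left
rest = cur.fetchmany(5)       # up to 5 of the remaining rows
done = cur.fetchone()         # the result set is exhausted now
conn.close()

print(first)    # ('Jane',)
print(rest)     # [('Jim',), ('John',)]
print(done)     # None
```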


# Try... except...

Catch expected or unexpected errors and keep the program from crashing.

In [68]:
! rm -f db1.sqlite
# example of inserting duplicate items

import sqlite3
conn = sqlite3.connect('db1.sqlite')
cur = conn.cursor()
cur = conn.execute('DROP TABLE IF EXISTS mytable')
cur.execute('CREATE TABLE mytable (name TEXT UNIQUE)')    # note the "UNIQUE" constraint

for x in ['John', 'Mary', 'Jane'] + ['Jack', 'Lily']:
#for x in ['John', 'Mary', 'Jane'] + ['John'] + ['Jack', 'Lily']:   # duplicate item causes error!

    cur.execute('INSERT INTO mytable (name) VALUES (?)', (x,))
    
conn.commit()

! sqlite3 db1.sqlite "select * from mytable"

John
Mary
Jane
Jack
Lily


In [69]:
!rm -f db1.sqlite

import sqlite3

conn = sqlite3.connect('db1.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS mytable')
cur.execute('CREATE TABLE mytable (name TEXT UNIQUE)')

for x in ['John', 'Mary', 'Jane'] + ['John'] + ['Jack', 'Lily']:   # contains duplicate item

    try:      # protect against the possible error with try-except
        cur.execute('INSERT INTO mytable (name) VALUES (?)', (x,))
    except sqlite3.IntegrityError:     # raised when the UNIQUE constraint is violated
        print("There is an insert error... ignore x = ", x)
        continue
    
conn.commit()
conn.close()

! sqlite3 db1.sqlite "select * from mytable"

There is an insert error... ignore x =  John
John
Mary
Jane
Jack
Lily


- When the action under "try" raises an error, control jumps to the "except" block.
- Try-except has many other uses
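
A sketch of the fuller form of the statement: naming the exception type keeps unrelated bugs from being silently swallowed, `else` runs only when no exception occurred, and `finally` always runs. The function name `safe_divide` is made up for this example.

```python
def safe_divide(x, y):
    """Return x / y, or None if y is zero."""
    try:
        result = x / y
    except ZeroDivisionError as err:
        print('cannot divide:', err)
        return None
    else:
        return result              # only reached when no exception occurred
    finally:
        print('done with', x, y)   # runs in every case

print(safe_divide(6, 3))    # 2.0
print(safe_divide(1, 0))    # None
```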

## Since you cannot guarantee the runtime environment, it is a good idea to protect the code from possible crashes by using try/except.

Another common example - trying to open a file that does not exist:

In [70]:
# What if the file does not exist

print('program starts...')

#f = open('myfile.txt', 'r')
#x = f.read()
#f.close()

print('program continues...')

program starts...
program continues...


In [71]:
# code protected by try-except:

print('program starts...')

try:
    f = open('myfile.txt', 'r')
    x = f.read()
    f.close()
except OSError:     # e.g. FileNotFoundError when the file does not exist
    print('file open error')

print('program continues...')

program starts...
file open error
program continues...
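
A variant of the same idea, sketched with a `with` block so that the file is closed automatically, and catching only `FileNotFoundError`. The function name `read_or_default` is hypothetical, and the missing path is built inside a fresh temporary directory so that it is certain not to exist.

```python
import os
import tempfile

def read_or_default(path, default=''):
    """Return the contents of the file at `path`, or `default` if it does not exist."""
    try:
        with open(path, 'r') as f:
            return f.read()
    except FileNotFoundError:
        return default

missing = os.path.join(tempfile.mkdtemp(), 'myfile.txt')   # nothing was created here
text = read_or_default(missing, default='(no file)')
print(text)      # (no file)
```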


# XML parsing example



XML example

```xml
<person>
  <name>John</name>
  <phone type="intl">
    +1 310 123 4567
  </phone>
  <email hide="yes"/>
</person>

```

For more information about XML, see, e.g. https://en.wikipedia.org/wiki/XML and other references

In [72]:
import xml.etree.ElementTree as ET

with open('xml.xml', 'r') as f:
    data = f.read()
    
tree = ET.fromstring(data)    # read in XML as a tree structure

print('Name = ', tree.find('name').text )           # get a value
print('Attr = ', tree.find('email').get('hide') )   # get an attribute

Name =  John
Attr =  yes
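
The same calls work on an XML string directly. As a sketch, the following parses the `<person>` example shown earlier (assuming `xml.xml` has similar content) and reads the phone number and its `type` attribute; `.strip()` removes the whitespace around the element text.

```python
import xml.etree.ElementTree as ET

data = """
<person>
  <name>John</name>
  <phone type="intl">
    +1 310 123 4567
  </phone>
  <email hide="yes"/>
</person>
"""

person = ET.fromstring(data)
phone = person.find('phone')

print(phone.text.strip())        # +1 310 123 4567
print(phone.get('type'))         # intl
print(person.find('fax'))        # None: find() returns None for a missing element
```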


## Accessing multiple nodes in XML

In [73]:
with open('xml2.xml', 'r') as f:
    data = f.read()

import xml.etree.ElementTree as ET

myxml = ET.fromstring(data)
ulist = myxml.findall('users/user')   # find all <user> elements under <users>

for x in ulist:
    print('Name = ', x.find('name').text  )
    print('id   = ', x.find('id').text  )

Name =  Marie
id   =  023
Name =  Bret
id   =  023


# JSON 

Another popular format for data exchange.

```json
{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}
```

For more info, see http://json.org

In [74]:
import json

with open('json.json', 'r') as f:
    data = f.read()

info = json.loads(data) # load a string in JSON format into a Python object
print(info)

[{'id': '015', 'x': '6', 'name': 'John'}, {'id': '023', 'x': '2', 'name': 'Marie'}]


In [75]:
for item in info:
    print(item['name'])

John
Marie
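
Going the other way, `json.dumps` turns a Python object back into a JSON string. A sketch using data shaped like the output above:

```python
import json

info = [{'id': '015', 'x': '6', 'name': 'John'},
        {'id': '023', 'x': '2', 'name': 'Marie'}]

text = json.dumps(info, indent=2)    # Python object -> JSON string
print(text)

restored = json.loads(text)          # JSON string -> Python object
print(restored == info)              # True
```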


## Google API example

In [76]:
import json
import urllib.request

u = urllib.request.urlopen('https://maps.googleapis.com/maps/api/geocode/json?' \
        + 'address=10920+Wilshire+Blvd,+Los+Angeles,+CA')
X = u.read().decode()
data = json.loads(X)

In [77]:
data

{'results': [{'address_components': [{'long_name': '10920',
     'short_name': '10920',
     'types': ['street_number']},
    {'long_name': 'Wilshire Boulevard',
     'short_name': 'Wilshire Blvd',
     'types': ['route']},
    {'long_name': 'Westwood',
     'short_name': 'Westwood',
     'types': ['neighborhood', 'political']},
    {'long_name': 'Los Angeles',
     'short_name': 'Los Angeles',
     'types': ['locality', 'political']},
    {'long_name': 'Los Angeles County',
     'short_name': 'Los Angeles County',
     'types': ['administrative_area_level_2', 'political']},
    {'long_name': 'California',
     'short_name': 'CA',
     'types': ['administrative_area_level_1', 'political']},
    {'long_name': 'United States',
     'short_name': 'US',
     'types': ['country', 'political']},
    {'long_name': '90024', 'short_name': '90024', 'types': ['postal_code']}],
   'formatted_address': '10920 Wilshire Blvd, Los Angeles, CA 90024, USA',
   'geometry': {'bounds': {'northeast': {'lat'

In [78]:
data['results'][0]

{'address_components': [{'long_name': '10920',
   'short_name': '10920',
   'types': ['street_number']},
  {'long_name': 'Wilshire Boulevard',
   'short_name': 'Wilshire Blvd',
   'types': ['route']},
  {'long_name': 'Westwood',
   'short_name': 'Westwood',
   'types': ['neighborhood', 'political']},
  {'long_name': 'Los Angeles',
   'short_name': 'Los Angeles',
   'types': ['locality', 'political']},
  {'long_name': 'Los Angeles County',
   'short_name': 'Los Angeles County',
   'types': ['administrative_area_level_2', 'political']},
  {'long_name': 'California',
   'short_name': 'CA',
   'types': ['administrative_area_level_1', 'political']},
  {'long_name': 'United States',
   'short_name': 'US',
   'types': ['country', 'political']},
  {'long_name': '90024', 'short_name': '90024', 'types': ['postal_code']}],
 'formatted_address': '10920 Wilshire Blvd, Los Angeles, CA 90024, USA',
 'geometry': {'bounds': {'northeast': {'lat': 34.0584039, 'lng': -118.4443457},
   'southwest': {'lat':

In [79]:
data['results'][0]['geometry']['location']['lat']

34.0581554
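
A long chain of subscripts like this raises `KeyError` or `IndexError` if the response has a different shape, for example when no result is found. A defensive sketch follows; the helper name `first_location` and the trimmed-down sample dictionary are assumptions for illustration, and no network request is made.

```python
def first_location(data):
    """Return (lat, lng) of the first result, or None if there is none."""
    results = data.get('results', [])
    if not results:
        return None
    loc = results[0]['geometry']['location']
    return loc['lat'], loc['lng']

# A reduced stand-in for the decoded response shown above; the lng value is made up
sample = {'results': [{'geometry': {'location': {'lat': 34.0581554,
                                                 'lng': -118.44}}}]}

print(first_location(sample))            # (34.0581554, -118.44)
print(first_location({'results': []}))   # None
```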

# File and path

The `os` module has a lot of useful tools.

```python
import os
```

- Linux and Mac use UNIX-style paths, e.g. `/Users/your_name/Desktop`
- Windows uses a different path style, e.g. `C:\\Users\\your_name\\Desktop`

A Python program can be made cross-platform (the same code runs on both Windows and UNIX) if you use the correct function calls to construct path names.

Using the `/` (on UNIX) or `\\` (on Windows) in path names makes your Python code non-cross-platform.
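
A small sketch of the difference: `os.path.join` inserts whatever separator the current platform uses, which is also available as `os.sep`. The directory names are placeholders.

```python
import os

p = os.path.join('Users', 'your_name', 'Desktop')
print(p)        # Users/your_name/Desktop on UNIX, Users\your_name\Desktop on Windows

# The pieces can be recovered without knowing the separator
print(os.path.basename(p))     # Desktop
print(os.path.dirname(p) == os.path.join('Users', 'your_name'))   # True
```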


In [80]:
# the current directory

import os
cwd = os.getcwd()
print("current directory = ", cwd)

current directory =  /Users/schuang/git/python-lecture-notes/module5


In [81]:
# list files

os.listdir(cwd)

['.ipynb_checkpoints',
 'data',
 'db1.sqlite',
 'json.json',
 'module5.ipynb',
 'mydb.sqlite',
 'xml.xml',
 'xml2.xml']

## join path and file name

In [82]:
## home directory

homedir = os.path.expanduser('~')    # works on both UNIX and Windows

In [83]:
root = os.path.expanduser('~')
subdir = 'Desktop'               # avoid the name "dir", which shadows a built-in function
os.path.join(root, subdir)       # on Windows, the path will automatically use `\\`

'/Users/schuang/Desktop'

In [84]:
# check whether a path is a regular file
homedir =  os.path.expanduser('~')   
f = os.path.join(homedir, 'git', 'pyclass-2017')   # this is a directory
os.path.isfile(f) 

False

In [85]:
f = os.path.join(homedir, 'git', 'pyclass-2017', 'info.tex')  # this is a file
os.path.isfile(f)  

True

In [86]:
# [o] if a file, [x] if not

cwd = os.path.join(homedir, 'git', 'pyclass-2017')
count_f = 0
count_d = 0
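for x in os.listdir(cwd):
    # caution: x is a bare name, so isfile(x) looks in the current working
    # directory; use os.path.isfile(os.path.join(cwd, x)) to test inside cwd
    if os.path.isfile(x):
        print('{:<40} {:.>20}'.format(x, '[o]'))
        count_f += 1              # same as count_f = count_f + 1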
    else:
        print('{:<40} {:.>20}'.format(x, '[x]'))
        count_d += 1
print("Total: {} files, {} non-files".format(count_f, count_d))

.git                                     .................[x]
.gitignore                               .................[x]
.ipynb_checkpoints                       .................[x]
__pycache__                              .................[x]
addr_to_img.py                           .................[x]
data                                     .................[x]
db1.sqlite                               .................[o]
fig                                      .................[x]
foo.json                                 .................[x]
hw1                                      .................[x]
hw2                                      .................[x]
hw3                                      .................[x]
hw4                                      .................[x]
HW4solutions.ipynb                       .................[x]
info.pdf                                 .................[x]
info.tex                                 .................[x]
ini.conf
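
In the output above almost every entry is marked `[x]`, including `info.tex`, which was shown to be a file earlier. The reason is that `os.listdir` returns bare names, and `os.path.isfile(x)` resolves a bare name relative to the current working directory rather than relative to `cwd`. A sketch of the usual remedy, joining the directory and the name first; it builds a throwaway directory with `tempfile` so that it runs anywhere, and the file names are made up.

```python
import os
import tempfile

top = tempfile.mkdtemp()                       # scratch directory
os.mkdir(os.path.join(top, 'subdir'))
with open(os.path.join(top, 'notes.txt'), 'w') as f:
    f.write('hello')

count_f = 0
count_d = 0
for x in sorted(os.listdir(top)):
    full = os.path.join(top, x)                # directory + bare name
    if os.path.isfile(full):
        count_f += 1
    else:
        count_d += 1

print("Total: {} files, {} non-files".format(count_f, count_d))   # Total: 1 files, 1 non-files
```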

# Traverse (walk) a directory

`os.walk()` yields a `(dirpath, dirnames, filenames)` tuple for every directory in the tree, top-down by default.

- dirpath: the path to the directory
- dirnames: a list of the subdirectories in dirpath
- filenames: a list of the non-directory files in dirpath

In [87]:
import os
for (dirname, dirs, files) in os.walk('/tmp'):
    print(dirname, dirs, files)

/tmp ['.vbox-schuang-ipc', 'Atom Crashes', 'boost_interprocess', 'com.apple.launchd.3bMDh79Ui2', 'com.apple.launchd.4UClYwsQIj', 'com.apple.launchd.4wItX14oyb', 'com.apple.launchd.6BJXlVqVmx', 'com.apple.launchd.7yQtqlAWSo', 'com.apple.launchd.8FZXXqp3cQ', 'com.apple.launchd.adomKZoVz6', 'com.apple.launchd.bQKr4cJ9av', 'com.apple.launchd.CoZJTDYX7l', 'com.apple.launchd.cVxAPyamfP', 'com.apple.launchd.dJiergRV3b', 'com.apple.launchd.guNT0Zqs7R', 'com.apple.launchd.hiQHmgNkPn', 'com.apple.launchd.IFVguN0HFp', 'com.apple.launchd.JLotExl3lY', 'com.apple.launchd.kpbZvYmIuV', 'com.apple.launchd.LA1LhW0m6g', 'com.apple.launchd.MlcAD8maaE', 'com.apple.launchd.NAYjgsYPoV', 'com.apple.launchd.pV0HO5ZLJb', 'com.apple.launchd.tEx63Z6cxl', 'com.apple.launchd.vUM0HFacuN', 'com.apple.launchd.WzsUtg9nKE', 'com.apple.launchd.YAQy0iQN7b', 'KSDownloadAction.nnuLpgH6OF', 'KSDownloadAction.y3BQDrLC11'] ['.keystone_install_lock', 'adobegc.log', 'adobesmuoutpEvScDf', 'adobesmuoutpHBL5g4', 'adobesmuoutpn2u3GB

In [88]:
# create full path from dir and filename using os.path.join

import os

cwd = os.getcwd()
fname = os.listdir(cwd)[0]
print("cwd = {}\nfname = {}\nfull path = {}".format(cwd, fname, os.path.join(cwd, fname)  ))

cwd = /Users/schuang/git/python-lecture-notes/module5
fname = .ipynb_checkpoints
full path = /Users/schuang/git/python-lecture-notes/module5/.ipynb_checkpoints


## file size

In [89]:
import os
cwd = os.getcwd()
f = os.path.join(cwd, 'xml.xml')
os.path.getsize(f)   # size in bytes

111

In [90]:
# total size of the non-directory files in each directory, skipping the .git directory

import os

cwd = os.getcwd()
for root, dirs, files in os.walk(cwd):
    
    total_size = 0
    for name in files:
        fullpath = os.path.join(root, name)
        total_size += os.path.getsize(fullpath)

    print("{:<55}{:>3} non-dir files {:>10} bytes".format(root, len(files), total_size))

    if '.git' in dirs:       # skip the .git directory
        dirs.remove('.git')

/Users/schuang/git/python-lecture-notes/module5          6 non-dir files      71310 bytes
/Users/schuang/git/python-lecture-notes/module5/.ipynb_checkpoints  1 non-dir files      50560 bytes
/Users/schuang/git/python-lecture-notes/module5/data     1 non-dir files       8192 bytes
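
The same loop can be packaged as a function. In this sketch the unwanted directory is pruned with `dirs[:] = ...`; modifying `dirs` in place is what tells `os.walk` (in its default top-down mode) not to descend into it. The function name `tree_size` and its `skip` parameter are hypothetical.

```python
import os
import tempfile

def tree_size(top, skip=('.git',)):
    """Total size in bytes of all non-directory files under `top`."""
    total = 0
    for root, dirs, files in os.walk(top):
        dirs[:] = [d for d in dirs if d not in skip]   # prune in place
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Try it on a scratch directory
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, '.git'))
for path, text in [(os.path.join(top, 'a.txt'), 'abc'),
                   (os.path.join(top, '.git', 'b.txt'), 'ignored')]:
    with open(path, 'w') as f:
        f.write(text)

print(tree_size(top))             # 3
print(tree_size(top, skip=()))    # 10
```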


## file time stamp

- ctime: on UNIX, the time of the last metadata change (e.g. permissions); on Windows, the creation time
- mtime: modification time
- atime: access time

In [91]:
import os
fn = 'json.json'
os.path.getatime(fn), os.path.getctime(fn), os.path.getmtime(fn)

(1508373704.0, 1507162208.0, 1507162208.0)

In [92]:
import datetime
datetime.datetime.fromtimestamp(1501106386.011095)

datetime.datetime(2017, 7, 26, 14, 59, 46, 11095)

In [93]:
datetime.datetime.fromtimestamp(1501106386.011095).strftime('%Y-%m-%d %H:%M')

'2017-07-26 14:59'

In [94]:
os.stat(fn)

os.stat_result(st_mode=33188, st_ino=19659054, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=138, st_atime=1508373704, st_mtime=1507162208, st_ctime=1507162208)

In [95]:
os.stat(fn).st_atime

1508373704.0
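
Putting the pieces together, a sketch of a helper that returns a file's modification time as a formatted string. The name `mtime_string` is made up, and a temporary file is created so that the example does not depend on any particular file being present.

```python
import datetime
import os
import tempfile

def mtime_string(path, fmt='%Y-%m-%d %H:%M'):
    """Return the modification time of `path` as a formatted local-time string."""
    ts = os.path.getmtime(path)                       # seconds since the epoch
    return datetime.datetime.fromtimestamp(ts).strftime(fmt)

# Demonstrate on a freshly created temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')

stamp = mtime_string(tmp.name)
os.remove(tmp.name)
print(stamp)
```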

## Changing directory

In [96]:
import os
cwd = os.getcwd()
print('current directory = {}'.format(cwd))
print(os.listdir(cwd)[0:5])

current directory = /Users/schuang/git/python-lecture-notes/module5
['.ipynb_checkpoints', 'data', 'db1.sqlite', 'json.json', 'module5.ipynb']


In [97]:
# change the current working directory

os.chdir('/tmp')
cwd = os.getcwd()
print('current directory = {}'.format(cwd))
print(os.listdir(cwd))

current directory = /private/tmp
['.keystone_install_lock', '.vbox-schuang-ipc', 'adobegc.log', 'adobesmuoutpEvScDf', 'adobesmuoutpHBL5g4', 'adobesmuoutpn2u3GB', 'adobesmuoutprQwPTq', 'Atom Crashes', 'boost_interprocess', 'com.adobe.acrobat.rna.0.1f5', 'com.adobe.acrobat.rna.1381.1f5', 'com.adobe.acrobat.rna.28be.1f5', 'com.adobe.acrobat.rna.3940.1f5', 'com.adobe.acrobat.rna.3d21.1f5', 'com.adobe.acrobat.rna.AcroCefBrowserLock', 'com.apple.launchd.3bMDh79Ui2', 'com.apple.launchd.4UClYwsQIj', 'com.apple.launchd.4wItX14oyb', 'com.apple.launchd.6BJXlVqVmx', 'com.apple.launchd.7yQtqlAWSo', 'com.apple.launchd.8FZXXqp3cQ', 'com.apple.launchd.adomKZoVz6', 'com.apple.launchd.bQKr4cJ9av', 'com.apple.launchd.CoZJTDYX7l', 'com.apple.launchd.cVxAPyamfP', 'com.apple.launchd.dJiergRV3b', 'com.apple.launchd.guNT0Zqs7R', 'com.apple.launchd.hiQHmgNkPn', 'com.apple.launchd.IFVguN0HFp', 'com.apple.launchd.JLotExl3lY', 'com.apple.launchd.kpbZvYmIuV', 'com.apple.launchd.LA1LhW0m6g', 'com.apple.launchd.MlcA

## Home directory

In [98]:
os.path.expanduser('~')

'/Users/schuang'

In [99]:
# full path

os.chdir(os.path.join(os.path.expanduser('~'), 'git', 'pyclass-2017'))
os.getcwd()

'/Users/schuang/git/pyclass-2017'
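
Because `os.chdir` changes the working directory for the rest of the program, it can be handy to restore the old one afterwards. A sketch using `contextlib.contextmanager` and `try`/`finally`; the name `working_directory` is hypothetical.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the current working directory."""
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old)          # restore even if the body raises

start = os.getcwd()
with working_directory(tempfile.gettempdir()):
    inside = os.getcwd()       # the temporary directory

print(os.getcwd() == start)    # True
```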