# Subsets of elements
* For lists, ranges, strings and other sequences that support indexing, slicing extracts subsets:

In [18]:
x = range(1, 20)
y = "The quick brown fox jumps over the lazy dog."
# Print the elements at indices 2 to 4 (the stop index 5 is excluded)
print(x[2:5])
print(y[2:5])
# Print every third element starting from the first element
print(x[::3])
print(y[::3])

# Print the second last element
print(x[-2])
print(y[-2])

range(3, 6)
e q
range(1, 20, 3)
T i o xusv ea g
18
g


# Libraries

* Python comes with batteries included.
* Many commonly needed tools are available in the standard library.
* The library is large and covers a broad range of tasks.
* Not all modules have the same quality, but most of them are great.
* Hundreds of modules, tested and ready to use.

# Standard Libraries

* Text processing `string`, `re`, `textwrap`
* Numbers `decimal`, `random`
* File system `os`, `glob`, `shutil`; in-memory file-like objects `io.StringIO`
* Building Applications `argparse`
* Runtime features `sys`
* Many others, covering networking, cryptography and more.

# String
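* Utilities for many common string tasks
* Methods such as `capitalize()` are available on every `str` object without an import
* Helper functions such as `capwords()` and constants such as `ascii_lowercase` need an _import_ of the `string` module
* Some examples: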

In [19]:
import string
s = 'the quick brown fox jumps over the lazy dog.'
s.capitalize()

'The quick brown fox jumps over the lazy dog.'

In [26]:
s = "The quick brown fox jumps over the lazy dog."
string.capwords(s)

'The Quick Brown Fox Jumps Over The Lazy Dog.'

# Example: Phonecode
* Convert letters to numbers as seen on phones
* Phone letters: 2 → ABC, 3 → DEF … 9 → WXYZ
* For instance, BUYME is 28963

In [27]:
phonecode_lower = str.maketrans(string.ascii_lowercase, '22233344455566677778889999')
phonecode_upper = str.maketrans(string.ascii_uppercase, '22233344455566677778889999')
s.translate(phonecode_lower)

'T43 78425 27696 369 58677 6837 843 5299 364.'

## Exercise
How would you convert both upper and lowercase to phonecode?

# textwrap
* Useful for pretty printing paragraphs
* Wrapping is provided automatically

In [36]:
import textwrap
sample_text = "This is a really long rambling sentence with no end in sight. This sentence is going to put people to sleep"
print(textwrap.fill(sample_text, width=10))
print()
print(textwrap.fill(sample_text, width=50))

This is a
really
long
rambling
sentence
with no
end in
sight.
This
sentence
is going
to put
people to
sleep

This is a really long rambling sentence with no
end in sight. This sentence is going to put people
to sleep


# Decimal
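* Module for decimal floating-point arithmetic with user-settable precision
* Binary floats cannot represent most decimal fractions exactly, which shows up as rounding noise: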

In [37]:
print(0.1-0.00000001)
print(0.1-0.000000001)
print(0.1-0.0000000001)

0.09999999000000001
0.099999999
0.09999999990000001


# With decimal

In [51]:
import decimal
a=(0,(1,0),-2)
b=(0,(1,0),-9)
type(a)

tuple

* A `Decimal` can be constructed from a tuple of sign (0 for positive, 1 for negative), digits and exponent

In [52]:
ad = decimal.Decimal(a)
bd = decimal.Decimal(b)
type(ad)
ad - bd

Decimal('0.099999990')
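
In everyday code a `Decimal` is more often built from a string than from a tuple. The sketch below repeats the subtraction above with string inputs and shows why a float should not be passed directly. The variable names are illustrative and not part of the original notebook.

```python
from decimal import Decimal

# Build the same values from strings instead of (sign, digits, exponent) tuples.
a_str = Decimal('0.10')
b_str = Decimal('0.000000010')
diff = a_str - b_str
print(diff)                      # exact decimal subtraction

# A float argument carries its binary rounding error into the Decimal,
# so it does not compare equal to the value built from a string.
from_float = Decimal(0.1)
from_string = Decimal('0.1')
print(from_float == from_string)
```

Passing strings keeps the digits exactly as written, which is usually what is wanted for money or measured values.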

# Contexts, Setting precision

In [55]:
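from decimal import getcontext  # import only what is used instead of *
print(getcontext())
getcontext().prec=20
cd = decimal.Decimal((0,(3,0),-1))
print(1/cd)
getcontext().prec=40
print(1/cd)
print(1/cd+1/cd)
print(1/cd+1/cd*3)
print(1/cd+1)
# print(1/cd+1.0)  # raises TypeError: Decimal and float cannot be mixed in arithmetic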

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])
0.33333333333333333333
0.3333333333333333333333333333333333333333
0.6666666666666666666666666666666666666666
1.333333333333333333333333333333333333333
1.333333333333333333333333333333333333333


# Setting precision locally

In [56]:
with decimal.localcontext() as c:
   c.prec=2
   print(decimal.Decimal('3.1415')/3)
   print('Local Precision:', c.prec)
print('Default precision:',decimal.getcontext().prec)
print(decimal.Decimal('3.1415')/3)

1.0
Local Precision: 2
Default precision: 40
1.047166666666666666666666666666666666667


* `c` is a local context. Changes to `c` are not propagated to the rest of the code.
* Several contexts can be set within the same code

# Per-instance context

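* A context can also be used to construct individual values: `create_decimal()` rounds the new value to that context's precision
* The value itself does not carry the context with it
* Later arithmetic uses the current context, so the product below is computed at the active precision (40 here), not at `prec = 3`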

In [57]:
import decimal
c = decimal.getcontext().copy()
c.prec = 3
pi = c.create_decimal('3.1415')
print('PI:', pi)
print('RESULT:', decimal.Decimal('2.01') * pi)

PI: 3.14
RESULT: 6.3114


# Fractions
* See the `fractions` module also
* A `Fraction` stores a numerator and denominator pair
* All operations are done exactly on the numerator and denominator
* Can convert fractions to floats and vice versa
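
A minimal sketch of the points above, using the standard `fractions.Fraction` class. The variable names are only for illustration.

```python
from fractions import Fraction

third = Fraction(1, 3)
sixth = Fraction(1, 6)
total = third + sixth                 # arithmetic stays exact and the result is reduced
print(total)
print(total.numerator, total.denominator)

as_float = float(Fraction(1, 4))      # fraction -> float
from_float = Fraction(0.5)            # float -> fraction (0.5 is exact in binary)
from_text = Fraction('3/4')           # a string is also accepted
print(as_float, from_float, from_text)
```

Converting a float such as 0.1 gives the exact binary value rather than 1/10, so strings or integer pairs are the safer inputs.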

# Random

In [78]:
import random
for i in range(5):
    print(random.random())

0.375986267407762
0.9822428719391535
0.38350504617783177
0.5204510064900055
0.9695754537075303


* Repeat once more

In [79]:
for i in range(5):
    print(random.random())

0.3826451857969857
0.0925560235163857
0.6164335402984279
0.9277889298403925
0.6115379823420043


* `random()` returns floats in the range [0, 1), and each run gives different values
* For uniform random floats in a chosen range, use `uniform(a, b)`:

In [76]:
for i in range(5):
    print(random.uniform(1,100))

66.17157363319582
72.20459802681887
10.729644905406404
22.906417534384897
58.7214199147031


# Setting seeds
* Sometimes you want to repeat experiments
* Set a seed and use it for debug/design phase
* Remove seed and deploy

In [86]:
random.seed(1)
for i in range(5):
   print(random.uniform(1,100))

14.30206016712772
84.89593995678604
76.6136872786848
26.251833548202747
50.04807362210215


* Run it again?
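
One way to check the effect of a seed without rerunning the cell by hand is to wrap the loop in a small function. `draw` below is a hypothetical helper, not part of the `random` module.

```python
import random

def draw(seed, count=5):
    """Seed the generator, then return `count` uniform values (hypothetical helper)."""
    random.seed(seed)
    return [random.uniform(1, 100) for _ in range(count)]

first_run = draw(1)
second_run = draw(1)
other_seed = draw(2)

print(first_run == second_run)   # same seed, same sequence
print(first_run == other_seed)   # different seed, different sequence
```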

# Random Integers

In [102]:
for i in range(3):
    print(random.randint(1, 100))
print('\n[-5, 5]:', end="")
for i in range(3):
    print(random.randint(-5, 5))
print()

63
94
4

[-5, 5]:2
-5
-1



# Random elements from a range

In [107]:
import random
for i in range(3):
    print(random.randrange(0, 101, 5))

85
35
60


* randrange supports a step argument
* More efficient because range is not fully constructed

# Picking random items

In [109]:
import random
outcomes = {'heads': 0, 'tails': 0}
sides = list(outcomes.keys())

# Simulate 10000 coin flips and count each outcome
for i in range(10000):
    outcomes[random.choice(sides)] += 1
print('Heads:', outcomes['heads'])
print('Tails:', outcomes['tails'])

Heads: 4980
Tails: 5020


* `choice()` selects randomly from a sequence

# Sampling
* Sampling from a population
* sample() function generates samples without repeating values
* Does not modify input sequence

In [114]:
import random
with open('/usr/share/dict/words', 'rt') as f:
   words = f.readlines()
words = [ w.rstrip() for w in words ]
for w in random.sample(words, 5):
    print(w)

mismanage
Alexander
unbeknown
poets
fatheads


# math library
* `math` has several mathematical functions
* Trigonometric, logarithmic and exponential functions, rounding helpers, and constants such as `pi` and `e`
* Take a look!
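
A short tour of a few `math` functions, as a starting point for exploring the module. The selection is arbitrary.

```python
import math

print(math.pi, math.e)                   # common constants
print(math.sqrt(16))                     # 4.0
print(math.factorial(5))                 # 120
print(math.floor(2.7), math.ceil(2.1))   # 2 3
print(math.hypot(3, 4))                  # 5.0
print(math.isclose(0.1 + 0.2, 0.3))      # True, although 0.1 + 0.2 == 0.3 is False
```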


# File System
* Python has several tools to work with files
* Common operations like looking at file contents, building filenames, parsing file names, traversing directories
* Filenames are represented as simple strings
* Strings can be built from platform independent components in os.path
  - For example, to deal with / in Unix vs. \ in Windows

# os.path
* Platform independent operations for file manipulation
* Even if you are not planning to port your application to another platform, use `os.path`


# Parsing Paths
* **os.sep**: The separator between portions of the path (e.g., `/` or `\`).
* **os.extsep**: The separator between a filename and the file "extension" (e.g., `.`).
* **os.pardir**: The path component that means traverse the directory tree up one level (e.g., `..`).
* **os.curdir**: The path component that refers to the current directory (e.g., `.`).

# os.path.split
* `split()` breaks the path into two separate parts
* returns a tuple

In [56]:
import os.path
for path in ['/one/two/three', '/one/two/three/', '/', '.', '']:
    print('%15s : %s' % (path, os.path.split(path)))

 /one/two/three : ('/one/two', 'three')
/one/two/three/ : ('/one/two/three', '')
              / : ('/', '')
              . : ('', '.')
                : ('', '')


# os.path.splitext and os.path.commonprefix
* `os.path.splitext()` splits a path at the extension separator
* Returns the name without the extension and the extension itself as a tuple
* `os.path.commonprefix()` returns the longest common leading string of the paths passed as argument
* Example:

In [57]:
paths=["/one/two/three","/one/two", "/one/two/three/a.txt"]
os.path.commonprefix(paths)

'/one/two'
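
The example above covers `commonprefix()` only. The sketch below adds `splitext()` and a caveat about `commonprefix()`. The file names are made up for illustration.

```python
import os.path

# splitext() splits at the last extension separator only.
with_ext = os.path.splitext('/one/two/three/a.txt')
double_ext = os.path.splitext('archive.tar.gz')
no_ext = os.path.splitext('no_extension')
print(with_ext)
print(double_ext)
print(no_ext)

# commonprefix() compares the strings character by character, so the
# result is not always a valid directory name.
prefix = os.path.commonprefix(['/usr/lib', '/usr/local'])
print(prefix)
```

When a real directory is needed, `os.path.commonpath()` (available from Python 3.5) compares whole path components instead of characters.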

# Building Paths
* To combine several path components, use `join()`
* If any argument to `join()` begins with `os.sep`, all previous arguments are discarded
* That argument becomes the beginning of the return value

In [59]:
import os.path
for parts in [ ('one', 'two', 'three'), ('/', 'one', 'two', 'three'),
               ('/one', '/two', '/three'),
]:
    print(parts, ':', os.path.join(*parts))

('one', 'two', 'three') : one/two/three
('/', 'one', 'two', 'three') : /one/two/three
('/one', '/two', '/three') : /three


# Normalizing Paths

* Strings from different joins can end up with redundant separators or `.` components:
  - `/one/two//three/four`
  - `/one/./two/three/./four`
* Normalization removes these cases
* `os.path.normpath(path)`
* Try on
  - `one/../two/three`
  - `one/./two/three`
  - `one//two//./three`
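
A quick way to try the three paths listed above is to loop over them. This is only a sketch, and the separator in the output depends on the platform.

```python
import os.path

candidates = ['one/../two/three', 'one/./two/three', 'one//two//./three']
normalized = [os.path.normpath(p) for p in candidates]
for original, clean in zip(candidates, normalized):
    print('%20s : %s' % (original, clean))
```

On a Unix-like system this should give `two/three`, `one/two/three` and `one/two/three`. Note that `normpath()` works purely on the string, so collapsing `..` can change the meaning of a path that contains symbolic links.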

# Traversing a Directory
* `os.walk(top)` yields a `(root, dirs, files)` tuple for each directory it visits
* Walks from the top directory to each of the subdirectories

In [118]:
import os
if not os.path.exists('example'):
    os.mkdir('example')
if not os.path.exists('example/one'):
    os.mkdir('example/one')
with open('example/one/file.txt', 'wt') as f:
    f.write('contents')
with open('example/two.txt', 'wt') as f:
    f.write('contents')
for root,dirs,files in os.walk('example'):
    print(root, dirs, files)

example ['one'] ['two.txt']
example/one [] ['file2.txt', 'file.txt']


# glob
* To find filenames which match a pattern
* Packs a lot of power
* Can use to find files that have a certain extension, prefix or some pattern in the middle
* **DO NOT** write custom code for filename matching; use `glob` instead

In [124]:
import glob
for name in glob.glob('example/*'):
     print(name)

example/two.txt
example/one


* Prints all the contents of `example` directory
* Does not recurse
* To list the contents of a subdirectory, the subdirectory must be part of the pattern (for example `example/one/*` or `example/*/*`)

In [74]:
for name in glob.glob('example/*w*.txt'):
    print(name)

example/two.txt
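
Since `glob` does not recurse by default, here is a sketch of two ways to reach files in subdirectories. It builds its own throwaway tree with `tempfile`, so the directory and file names are assumptions for the example and not the `example` directory used above. The `recursive=True` argument is available from Python 3.5.

```python
import glob
import os
import tempfile

# Build a small throwaway tree so the example does not depend on existing files.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, 'one'))
    for rel in ('two.txt', os.path.join('one', 'file.txt')):
        with open(os.path.join(top, rel), 'wt') as f:
            f.write('contents')

    # One level down: name the subdirectory level explicitly in the pattern.
    one_level = glob.glob(os.path.join(top, '*', '*.txt'))
    # Any depth: '**' together with recursive=True.
    any_depth = glob.glob(os.path.join(top, '**', '*.txt'), recursive=True)

    # Store paths relative to the temporary directory for readable output.
    found_one_level = sorted(os.path.relpath(p, top) for p in one_level)
    found_any_depth = sorted(os.path.relpath(p, top) for p in any_depth)

print(found_one_level)
print(found_any_depth)
```

The first pattern finds only `one/file.txt`; the recursive pattern also matches `two.txt` at the top level, because `**` can match zero directories.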


# shutil
* Useful for high-level file operations
  - Copy files and directory trees
  - Copy permissions and other metadata, build archives etc.
* `import shutil`
* Some important ones:
  - `copy()`, `copy2()`, `copytree()`, `rmtree()`, `copystat()`, `make_archive()`
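
A minimal sketch of a few of these functions, run inside a temporary directory so nothing is left behind. All file and directory names here are made up for the example.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as work:
    src_dir = os.path.join(work, 'src')
    os.mkdir(src_dir)
    src_file = os.path.join(src_dir, 'data.txt')
    with open(src_file, 'wt') as f:
        f.write('contents')

    # copy2() copies the file contents and tries to preserve metadata.
    copied_file = shutil.copy2(src_file, os.path.join(work, 'data_copy.txt'))

    # copytree() copies a whole directory; the destination must not exist yet.
    tree_copy = shutil.copytree(src_dir, os.path.join(work, 'src_copy'))

    # make_archive() returns the name of the archive it created.
    archive = shutil.make_archive(os.path.join(work, 'backup'), 'zip', root_dir=src_dir)
    archive_exists = os.path.exists(archive)

    # rmtree() removes a directory and everything below it.
    shutil.rmtree(tree_copy)
    tree_removed = not os.path.exists(tree_copy)

    with open(copied_file) as f:
        copied_text = f.read()

print(copied_text, archive_exists, tree_removed)
```

`rmtree()` deletes without asking, so it is worth double-checking the path before calling it on real data.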

# sys
* Probe or change the configuration of the Python interpreter
* Interact with the environment

In [76]:
import sys

print(sys.version)
print(sys.platform)
print(sys.flags)

3.5.2+ (default, Aug  5 2016, 08:07:14) 
[GCC 6.1.1 20160724]
linux


# Runtime Environment with sys
* `sys.argv` holds the command-line arguments as a list; the first item is the name of the script
* Example:

In [125]:
import sys
print(sys.argv)

['/home/kumar/.local/lib/python3.5/site-packages/ipykernel/__main__.py', '-f', '/run/user/1000/jupyter/kernel-96e3a7ca-f2a2-49d0-bd4d-d3b0a0dfd132.json']


* Save the code above in `sysargs.py`
* Then run
  - `python sysargs.py`
  - `python sysargs.py -l`
  - `python sysargs.py -l -a`

# Input and Output Streams
* Following Unix philosophy, the standard streams are available as `sys.stdin`, `sys.stdout` and `sys.stderr`

In [126]:
import sys
print('STATUS: Reading from stdin', file=sys.stderr)
data = sys.stdin.read()
print('STATUS: Writing data to stdout', file=sys.stderr)
sys.stdout.write(data)
sys.stdout.flush()
print('STATUS: Done', file=sys.stderr)

STATUS: Reading from stdin
STATUS: Writing data to stdout
STATUS: Done


# Exit Code
* By convention, a program exits with code 0 on a clean exit
* A non-zero exit code signals an error
* Example: Write a program that expects a number less than 5 as input

In [None]:
import sys
# Expect a number as the first command-line argument
num = int(sys.argv[1])
if num < 5:
    sys.exit(0)   # clean exit
else:
    sys.exit(1)   # signal an error
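
The exit code can be inspected from the shell (for example with `echo $?`) or from another Python program. The sketch below uses `subprocess.run()`, available from Python 3.5, and passes a one-line stand-in for the script above through `-c` so that no file is needed. `child_code` is an assumption made for this example.

```python
import subprocess
import sys

# Stand-in for the script above, passed inline with -c so no file is needed.
child_code = 'import sys; sys.exit(0 if int(sys.argv[1]) < 5 else 1)'

ok = subprocess.run([sys.executable, '-c', child_code, '3'])
bad = subprocess.run([sys.executable, '-c', child_code, '7'])

print('input 3 ->', ok.returncode)
print('input 7 ->', bad.returncode)
```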

# Memory Management
* `sys.getrefcount()`
* `sys.getsizeof()`

# Refcount

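* References to a Python object are counted
* The object is deleted automatically when its refcount hits 0
* `sys.getrefcount()` reports one more than expected, because its own argument is a temporary reference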

In [79]:
import sys
one = []
print('At start         :', sys.getrefcount(one))
two = one
print('Second reference :', sys.getrefcount(one))
del two
print('After del        :', sys.getrefcount(one))

At start         : 2
Second reference : 3
After del        : 2


# Object Sizes

In [80]:
import sys

for obj in [ [], (), {}, 'c', 'string', 1, 2.3,
         ]:
    print('%10s : %s' % (type(obj).__name__, sys.getsizeof(obj)))

      list : 64
     tuple : 48
      dict : 288
       str : 50
       str : 55
       int : 28
     float : 24


* Caution: The size reported for an instance of a custom class does not include the size of its attributes

# sizeof Attributes

In [82]:
import sys
class myClass(object):
    def __init__(self):
        self.a = 'a'
        self.b = 120
        self.c = 10.5

myInst = myClass()

# Add up the sizes of the attribute values stored on the instance
print(sum(sys.getsizeof(v) for v in myInst.__dict__.values()))

102


## Exercise
* Write a `__sizeof__` method for the custom class above so that the attribute sizes are included
* Use `object.__sizeof__(self)` to get the size of the base `object` part
* `sys.getsizeof(myInst)` should then report the combined size

# Ints and Floats
* `sys.maxint` exists only in Python 2; Python 3 integers have no upper limit
* `sys.maxsize`
* `sys.float_info.*`
  - Has several important constants related to floating point values
  - Stuff usually found in `float.h`
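
A brief sketch of what these attributes contain on Python 3. The printed values depend on the platform, so none are listed here.

```python
import sys

print(sys.maxsize)                 # largest Py_ssize_t value, e.g. the maximum length of a list
big = sys.maxsize + 1              # Python 3 ints keep growing past maxsize
print(big > sys.maxsize)

info = sys.float_info
print(info.max, info.min)          # largest and smallest normalized positive floats
print(info.epsilon)                # gap between 1.0 and the next representable float
print(info.dig, info.mant_dig)     # reliable decimal digits and mantissa bits
print(1.0 + info.epsilon > 1.0)
```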

# Modules
* `sys.modules.keys()` - Names of all the modules imported so far, changes dynamically
* `sys.builtin_module_names` - Names of all modules compiled into the interpreter
* `sys.path` - List of directories searched when importing modules
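
The following sketch shows `sys.modules` changing as a module is imported. `colorsys` is picked only because it is a small standard library module that is unlikely to be loaded already; any other module would do.

```python
import sys

print('sys' in sys.builtin_module_names)   # sys is compiled into the interpreter

before = 'colorsys' in sys.modules
import colorsys                            # importing registers the module in sys.modules
after = 'colorsys' in sys.modules
print(before, after)

print(sys.path[:3])                        # first few directories searched for imports
```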

# URL Library
* urllib
  - Provides powerful methods for accessing URLs

# Example
* Let us fetch the class web page from Madhu Belur's course site and look at the response headers

In [2]:
from urllib.request import urlopen

response = urlopen("http://www.ee.iitb.ac.in/~belur/sdes")
#print(response)
print(response.headers)
#print(response.headers['Date'])
#print(response.info())
#text = response.readlines()
#print(text)

Date: Mon, 29 Aug 2016 12:02:06 GMT
Server: Apache/2.4.6 (Red Hat Enterprise Linux) OpenSSL/1.0.1e-fips mod_perl/2.0.9dev Perl/v5.16.3
Last-Modified: Tue, 02 Aug 2016 08:25:50 GMT
ETag: "4047-5391276dddf80"
Accept-Ranges: bytes
Content-Length: 16455
Connection: close
Content-Type: text/html; charset=UTF-8
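
`urllib` also contains `urllib.parse` for taking URLs apart and building new ones, which needs no network access. The sketch below reuses the course URL from the example above. The relative link `notes.html` is a made-up name used only to illustrate `urljoin()`.

```python
from urllib.parse import urlparse, urljoin

url = "http://www.ee.iitb.ac.in/~belur/sdes"
parts = urlparse(url)
print(parts.scheme)    # http
print(parts.netloc)    # www.ee.iitb.ac.in
print(parts.path)      # /~belur/sdes

# Resolve a relative link against the page URL ('notes.html' is a made-up name).
resolved = urljoin(url + "/", "notes.html")
print(resolved)
```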


