The str() function is meant to return representations of values which are fairly human-readable, while repr() is meant to generate representations which can be read by the interpreter (or will force a SyntaxError if there is no equivalent syntax). For objects which don’t have a particular representation for human consumption, str() will return the same value as repr().

In [1]:
s = 'Hello, world.'
str(s)


'Hello, world.'

In [2]:
repr(s)

"'Hello, world.'"

In [3]:
str(1.0/7.0)

'0.142857142857'

In [4]:
repr(1.0/7.0)

'0.14285714285714285'

In [5]:
1.0/7.0

0.14285714285714285

In [6]:
x = 10 * 3.25
y = 200 * 200
s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
print s

The value of x is 32.5, and y is 40000...


In [11]:
# The repr() of a string adds string quotes and backslashes:
hello = 'hello, world\n'
hellos = repr(hello)

In [12]:
print hellos

'hello, world\n'


In [13]:
print hello

hello, world
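The fallback mentioned at the top of this section (str() returning the same value as repr() when there is no separate human-readable form) can be sketched with two small classes. Point and LabeledPoint are hypothetical names made up for this illustration, and the code uses print() so that it runs under Python 3 as well as the Python 2.7 used in this notebook.

```python
class Point(object):
    """Hypothetical class that defines only __repr__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Point(%r, %r)' % (self.x, self.y)


class LabeledPoint(Point):
    """Hypothetical subclass that adds a human-readable __str__."""
    def __str__(self):
        return '(%s, %s)' % (self.x, self.y)


p = Point(1, 2)
q = LabeledPoint(1, 2)

print(repr(p))  # the interpreter-style form from __repr__
print(str(p))   # no __str__ defined, so str() falls back to __repr__
print(str(q))   # __str__ is defined, so str() and repr() now differ
print(repr(q))
```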



### Here are two ways to write a table of squares and cubes:

In [14]:
for x in range(1, 11):
    print repr(x).rjust(2), repr(x*x).rjust(3),
    # Note trailing comma on previous line
    print repr(x*x*x).rjust(4)

 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000


In [25]:
for x in range(1,11):
    print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)

 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000


In [31]:
for x in range(1,11):
    print '{0:1d} {1:4d} {2:10d}'.format(x, x*x, x*x*x)

1    1          1
2    4          8
3    9         27
4   16         64
5   25        125
6   36        216
7   49        343
8   64        512
9   81        729
10  100       1000


In [33]:
for x in range(1,11):
    print '{0:d} {1:d} {2:d}'.format(x, x*x, x*x*x)

1 1 1
2 4 8
3 9 27
4 16 64
5 25 125
6 36 216
7 49 343
8 64 512
9 81 729
10 100 1000


In [38]:
for x in range(1,11):
    print '{0:2d} {1:d} {2:6d}'.format(x, x*x, x*x*x)

 1 1      1
 2 4      8
 3 9     27
 4 16     64
 5 25    125
 6 36    216
 7 49    343
 8 64    512
 9 81    729
10 100   1000


#### Right and Left Justified

In [17]:
repr(x)

'10'

In [19]:
repr(x).rjust(10)

'        10'

In [23]:
repr(x).ljust(10)

'10        '

In [21]:
for x in range(1, 11):
    print repr(x).rjust(2), repr(x*x).rjust(10),
    # Note trailing comma on previous line
    print repr(x*x*x).rjust(4)

 1          1    1
 2          4    8
 3          9   27
 4         16   64
 5         25  125
 6         36  216
 7         49  343
 8         64  512
 9         81  729
10        100 1000


This example demonstrates the str.rjust() method of string objects, which right-justifies a string in a field of a given width by padding it with spaces on the left. There are similar methods str.ljust() and str.center(). These methods do not write anything; they just return a new string. If the input string is too long, they don’t truncate it, but return it unchanged; this will mess up your column layout, but that’s usually better than the alternative, which would be lying about a value. (If you really want truncation you can always add a slice operation, as in x.ljust(n)[:n].)
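As a minimal sketch of the pad-then-slice idiom just described, the helper below wraps ljust() and a slice in one function. The name fit is hypothetical and not part of the str API.

```python
def fit(text, width):
    """Left-justify text in a column and cut it off at width characters.

    Hypothetical helper for illustration, not a built-in str method.
    """
    return text.ljust(width)[:width]


names = ['Sirius', 'Polaris', 'Vega']

# ljust() alone pads short names but leaves long ones untouched
padded_only = [name.ljust(6) for name in names]

# pad and slice: every entry is exactly 6 characters wide
fitted = [fit(name, 6) for name in names]

for name in fitted:
    print('|' + name + '|')
```

The price of a fixed column width is that 'Polaris' is shown as 'Polari', which is the "lying about a value" trade-off the paragraph warns about.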

In [43]:
x

1

In [46]:
n = 1
for x in range(1, 11):
    print repr(x).ljust(n)[:n]

1
2
3
4
5
6
7
8
9
1


In [48]:
names = ["Sirius","Polaris","Vega"]

In [49]:
n = 4
for x in names:
    print repr(x).ljust(n)[:n]

'Sir
'Pol
'Veg


In [50]:
for x in names:
    print repr(x).ljust(4)

'Sirius'
'Polaris'
'Vega'


In [56]:
for x in names:
    print repr(x).rjust(20)

            'Sirius'
           'Polaris'
              'Vega'


In [65]:
for x in names:
    print repr(x).center(60)

                          'Sirius'                          
                         'Polaris'                          
                           'Vega'                           


There is another method, str.zfill(), which pads a numeric string on the left with zeros. It understands plus and minus signs:

In [67]:
'12'.zfill(5)

'00012'

In [68]:
'-3.14'.zfill(7)

'-003.14'

In [69]:
'3.14159265359'.zfill(5)  # already longer than 5 characters, so no zeros are added

'3.14159265359'

In [70]:
print '{0} and {1}'.format('spam', 'eggs')

spam and eggs


In [71]:
print '{1} and {0}'.format('spam', 'eggs')

eggs and spam


In [72]:
print '{0} and {1}'.format(200, 'eggs')

200 and eggs


In [82]:
params = ["alpha","beta","epeak","norm"]
alpha, beta, epeak, norm = [-1.25, -2.34, 524.50, 0.07242]
model = "band"

In [None]:
#alpha, beta, epeak, norm = [-1.25, -2.34, 524.50, 0.07242]

In [86]:
print '{0} are {1}'.format(params, [alpha, beta, epeak, norm])

['alpha', 'beta', 'epeak', 'norm'] are [-1.25, -2.34, 524.5, 0.07242]


If keyword arguments are used in the str.format() method, their values are referred to by using the name of the argument.

In [87]:
print 'This {food} is {adjective}.'.format(
      food='spam', adjective='absolutely horrible')


This spam is absolutely horrible.


'!s' (apply str()) and '!r' (apply repr()) can be used to convert the value before it is formatted.



In [88]:
import math
print 'The value of PI is approximately {}.'.format(math.pi)

The value of PI is approximately 3.14159265359.


In [89]:
print 'The value of PI is approximately {!r}.'.format(math.pi)

The value of PI is approximately 3.141592653589793.


In [90]:
print 'The value of PI is approximately {!s}.'.format(math.pi)

The value of PI is approximately 3.14159265359.
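The difference between '!s' and '!r' is easier to see with a string than with a float, because repr() adds quotes and escapes. This is a small sketch using made-up values; the variable names are only for illustration.

```python
word = 'spam'

# the same argument rendered through str() and through repr()
both = '{0!s} vs {0!r}'.format(word)
print(both)

# repr() shows the newline as an escape sequence instead of breaking the line
line = 'first\nsecond'
shown = '{0!r}'.format(line)
print(shown)

# a conversion can be combined with a format spec: convert first, then pad
padded = '{0!r:>10}'.format(word)
print(padded)
```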


An optional ':' and format specifier can follow the field name. This allows greater control over how the value is formatted. The following example rounds Pi to three places after the decimal.

In [91]:
print 'The value of PI is approximately {0:.3f}.'.format(math.pi)

The value of PI is approximately 3.142.


In [93]:
print 'The value of PI is approximately {0:.5f}.'.format(math.pi)

The value of PI is approximately 3.14159.


In [97]:
print 'The value of PI is approximately {0:.5}.'.format(math.pi)

The value of PI is approximately 3.1416.


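Without the f presentation type, {0:.5} treats the precision as the total number of significant digits, so 5 digits are printed in all (not counting the decimal point) rather than 5 digits after the decimal point.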
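To make the two meanings of the precision concrete, here is a small sketch that formats the same values with and without the f type. The second value, 0.000123456, is an arbitrary number chosen for illustration.

```python
import math

# with 'f': the precision counts digits after the decimal point
fixed = '{0:.5f}'.format(math.pi)

# without a type: the precision counts significant digits
general = '{0:.5}'.format(math.pi)

# the difference matters most for small numbers
small_fixed = '{0:.3f}'.format(0.000123456)
small_general = '{0:.3}'.format(0.000123456)

print(fixed)
print(general)
print(small_fixed)
print(small_general)
```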

In [102]:
flights = {'Delta': 127, 'American Airlines': 4098, 'Alaska Airlines': 678}
for company, plane_num in flights.items():
    print '{0:10} ==> {1:10d}'.format(company, plane_num)
    
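# The width (10) is only a minimum, so names longer than 10 characters overflow the column.
# By default strings are left-aligned and numbers are right-aligned; the d is optional for ints.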

Alaska Airlines ==>        678
American Airlines ==>       4098
Delta      ==>        127
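The ragged columns above come from a fixed width that is shorter than the longest name. One possible fix, sketched here, is to compute the width from the data and pass it in as a nested replacement field; the names width and rows are assumptions made for this example.

```python
flights = {'Delta': 127, 'American Airlines': 4098, 'Alaska Airlines': 678}

# use the longest company name as the column width
width = max(len(name) for name in flights)

rows = []
for company in sorted(flights):
    # '<' left-aligns the name, '>' right-aligns the number;
    # {w} is filled in from the keyword argument before formatting
    rows.append('{0:<{w}} ==> {1:>6d}'.format(company, flights[company], w=width))

for row in rows:
    print(row)
```

Sorting the keys is only there to make the output order predictable.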


In [126]:
flights = {'Delta': '127', 'American Airlines': '4098', 'Alaska Airlines': '678'}

In [161]:
for company, plane_num in flights.items():
    plane_num_2 = plane_num.zfill(4)
    print '{0:20} ==> {1:}'.format(company, plane_num_2)

Alaska Airlines      ==> 0678
American Airlines    ==> 4098
Delta                ==> 0127


If you have a really long format string that you don’t want to split up, it would be nice if you could reference the variables to be formatted by name instead of by position. This can be done by passing the dict and using square brackets '[ ]' to access the keys (for example '{0[Jack]:d}'), or, as in the cell below, by unpacking the dict into keyword arguments with the '**' notation.

In [176]:
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)

Jack: 4098; Sjoerd: 4127; Dcab: 8637678
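For comparison, here is a sketch of the square-bracket form next to the '**' form, using the same table. Both build the same string; by_index and by_keyword are names chosen for this example.

```python
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}

# pass the dict as positional argument 0 and index into it with [key]
by_index = 'Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; Dcab: {0[Dcab]:d}'.format(table)

# unpack the dict so that each key becomes a keyword argument
by_keyword = 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)

print(by_index)
print(by_keyword)
```

Note that the key inside the brackets is written without quotes.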


## JSON - JavaScript Object Notation

The json module can take Python data hierarchies and convert them to string representations; this process is called serializing. Reconstructing the data from the string representation is called deserializing. Between serializing and deserializing, the string representing the object may have been stored in a file or other data store, or sent over a network connection to some distant machine.

In [178]:
import json

In [191]:
f = open('workfile', 'w')
    #read_data = f.read()
#f.closed


In [192]:
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}

In [193]:
models = ["band", "sbpl", "copl", "bbody"]

In [194]:
json.dump([table,models], f)

In [195]:
f.close()

In [179]:
json.dumps([1, 'simple', 'list'])

'[1, "simple", "list"]'

In [197]:
f = open('workfile', 'r')

In [198]:
x = json.load(f)

In [199]:
x

[{u'Dcab': 8637678, u'Jack': 4098, u'Sjoerd': 4127},
 [u'band', u'sbpl', u'copl', u'bbody']]

In [200]:
len(x)

2

In [209]:
list(x[1])

[u'band', u'sbpl', u'copl', u'bbody']

In [224]:
tuple(x[1])

(u'band', u'sbpl', u'copl', u'bbody')

In [225]:
f.close()
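The same write-then-read round trip can also be sketched with with blocks, which close the file automatically. The scratch directory and the file name workfile.json are assumptions made so that the example does not touch the working directory.

```python
import json
import os
import tempfile

table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
models = ['band', 'sbpl', 'copl', 'bbody']

# write into a scratch directory (hypothetical location for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'workfile.json')

# the file is closed when the block ends, even if an error is raised
with open(path, 'w') as f:
    json.dump([table, models], f)

with open(path, 'r') as f:
    loaded_table, loaded_models = json.load(f)

print(loaded_table['Jack'])
print(loaded_models)
```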

## More advanced with JSON

In [227]:
a = json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])

In [228]:
a

'["foo", {"bar": ["baz", null, 1.0, 2]}]'

In [229]:
len(a) #it's all a string at the moment.

39

In [230]:
print json.dumps("\"foo\bar")

"\"foo\bar"


In [231]:
print("\"foo\bar")

"fooar


In [232]:
print json.dumps(u'\u1234')

"\u1234"


In [233]:
print json.dumps('\\')

"\\"


In [234]:
print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)

{"a": 0, "b": 0, "c": 0}


In [235]:
from StringIO import StringIO
io = StringIO()
json.dump(['streaming API'], io)
io.getvalue()

'["streaming API"]'

In [237]:
kz = StringIO()
json.dump(['Kim Zoldak'], kz)
kz.getvalue()

'["Kim Zoldak"]'

In [238]:
kz = StringIO()
json.dump(['Kim Zoldak', 'Sirius Zoldak', 'Derek Meyers'], kz)
kz.getvalue()

'["Kim Zoldak", "Sirius Zoldak", "Derek Meyers"]'

In [243]:
kz.len # since you can't do len(kz) with a StringIO()

47

### Compact encoding:

In [244]:
json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))

'[1,2,3,{"4":5,"6":7}]'

vs. 

In [245]:
json.dumps([1,2,3,{'4': 5, '6': 7}])

'[1, 2, 3, {"4": 5, "6": 7}]'

### Pretty printing:

In [248]:
import json
print json.dumps({'Kim': 5, 'Derek': 7}, sort_keys=True,
                 indent=4, separators=(',', ': '))

{
    "Derek": 7,
    "Kim": 5
}


In [249]:
a = json.dumps({'Kim': 5, 'Derek': 7}, sort_keys=True,
                 indent=4, separators=(',', ': '))

In [251]:
a

'{\n    "Derek": 7,\n    "Kim": 5\n}'

#### Parsing json

In [260]:
json_string = '{"first_name": "Kim", "last_name":"Zoldak"}'

In [261]:
json_string

'{"first_name": "Kim", "last_name":"Zoldak"}'

In [262]:
parsed_json = json.loads(json_string)

In [263]:
parsed_json

{u'first_name': u'Kim', u'last_name': u'Zoldak'}

In [264]:
parsed_json['first_name']

u'Kim'

In [265]:
print(parsed_json['first_name'])

Kim


In [286]:
data = {
    'first_name': 'Kim',
    'last_name': 'Zoldak',
    'titles': ['Astronomer', 'Meteorologist']
}


In [287]:
data

{'first_name': 'Kim',
 'last_name': 'Zoldak',
 'titles': ['Astronomer', 'Meteorologist']}

In [288]:
data['titles']

['Astronomer', 'Meteorologist']

In [289]:
data['first_name']

'Kim'

In [290]:
print(json.dumps(data))

{"first_name": "Kim", "last_name": "Zoldak", "titles": ["Astronomer", "Meteorologist"]}


In [297]:
f = open("workfile_2", 'w')  # modes: 'w' = write, 'r' = read, 'a' = append

In [298]:
json.dump(data, f)  

In [299]:
f.close()

In [301]:
data = {
    'first_name': ['Kim', 'Stef', 'Derek'],
    'last_name': ['Zoldak','Lane','Meyers'],
    'titles': [['Astronomer', 'Meteorologist'],['Hist. Teach.', 'Bball Coach'],
               ['Physicist','Condensed Matter Phys']],
    'age': [28, 30, 30]
}


In [302]:
data['first_name']

['Kim', 'Stef', 'Derek']

In [303]:
data['last_name']

['Zoldak', 'Lane', 'Meyers']

In [304]:
data['titles']

[['Astronomer', 'Meteorologist'],
 ['Hist. Teach.', 'Bball Coach'],
 ['Physicist', 'Condensed Matter Phys']]

In [305]:
data['age']

[28, 30, 30]

In [306]:
len(data)

4

In [319]:
data.keys()

['first_name', 'last_name', 'titles', 'age']

In [322]:
data.has_key('first_name')

True

In [331]:
data.items()

[('first_name', ['Kim', 'Stef', 'Derek']),
 ('last_name', ['Zoldak', 'Lane', 'Meyers']),
 ('titles',
  [['Astronomer', 'Meteorologist'],
   ['Hist. Teach.', 'Bball Coach'],
   ['Physicist', 'Condensed Matter Phys']]),
 ('age', [28, 30, 30])]

In [335]:
data.viewkeys()

dict_keys(['first_name', 'last_name', 'titles', 'age'])

In [341]:
data.values()

[['Kim', 'Stef', 'Derek'],
 ['Zoldak', 'Lane', 'Meyers'],
 [['Astronomer', 'Meteorologist'],
  ['Hist. Teach.', 'Bball Coach'],
  ['Physicist', 'Condensed Matter Phys']],
 [28, 30, 30]]

In [339]:
data.viewvalues()

dict_values([['Kim', 'Stef', 'Derek'], ['Zoldak', 'Lane', 'Meyers'], [['Astronomer', 'Meteorologist'], ['Hist. Teach.', 'Bball Coach'], ['Physicist', 'Condensed Matter Phys']], [28, 30, 30]])

In [342]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
print 'DATA:', repr(data)

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]


In [343]:
data_string = json.dumps(data)
print 'JSON:', data_string

JSON: [{"a": "A", "c": 3.0, "b": [2, 4]}]


### json.dumps vs json.dump

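json.dumps returns the encoded JSON as a string, which you can assign to a variable in the current Python session. json.dump writes the encoded JSON to an open file (or other file-like object) instead.

The same goes for loading: json.loads parses a string, while json.load reads from a file.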
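A compact sketch of all four functions side by side, using an in-memory buffer in place of a real file. The data dict is made up for the example, and the try/except import is only there so the code runs on both Python 2 and Python 3.

```python
import json

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO        # Python 3

data = {'model': 'band', 'params': [-1.25, -2.34]}

# dumps/loads work with strings
text = json.dumps(data, sort_keys=True)
from_text = json.loads(text)

# dump/load work with file-like objects
buf = StringIO()
json.dump(data, buf, sort_keys=True)
buf.seek(0)  # rewind before reading back
from_file = json.load(buf)

print(text)
print(from_file == from_text)
```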

### Using the following site
https://pymotw.com/2/json/

### Encoding, then re-decoding may not give exactly the same type of object.

In [346]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]

In [347]:
data_string = json.dumps(data)  
print 'ENCODED:', data_string

ENCODED: [{"a": "A", "c": 3.0, "b": [2, 4]}]


In [348]:
decoded = json.loads(data_string)
print 'DECODED:', decoded

DECODED: [{u'a': u'A', u'c': 3.0, u'b': [2, 4]}]


In [349]:
print 'ORIGINAL:', type(data[0]['b'])  # tuple
print 'DECODED :', type(decoded[0]['b'])  # list

ORIGINAL: <type 'tuple'>
DECODED : <type 'list'>
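Tuples are not the only type that changes in a round trip. As a small sketch with made-up data, integer dictionary keys come back as strings, and a tuple has to be rebuilt by hand if the original type matters.

```python
import json

original = {'point': (2, 4), 1: 'one'}
decoded = json.loads(json.dumps(original))

# the tuple came back as a list and the int key 1 came back as the string '1'
print(decoded)

# convert back explicitly where the original type is needed
restored_point = tuple(decoded['point'])
print(restored_point == original['point'])
```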


The dumps() function accepts several arguments to make the output even nicer. For example, sort_keys tells the encoder to output the keys of a dictionary in sorted, instead of arbitrary, order.

In [359]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
print 'DATA:', repr(data)

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]


In [360]:
unsorted = json.dumps(data)
print 'JSON:', json.dumps(data)
print 'SORT:', json.dumps(data, sort_keys=True)

JSON: [{"a": "A", "c": 3.0, "b": [2, 4]}]
SORT: [{"a": "A", "b": [2, 4], "c": 3.0}]


In [361]:
first = json.dumps(data, sort_keys=True)
second = json.dumps(data, sort_keys=True)

In [362]:
first

'[{"a": "A", "b": [2, 4], "c": 3.0}]'

In [363]:
second

'[{"a": "A", "b": [2, 4], "c": 3.0}]'

In [364]:
print 'UNSORTED MATCH:', unsorted == first
print 'SORTED MATCH  :', first == second

UNSORTED MATCH: False
SORTED MATCH  : True


In [365]:
first == second # True. first and second are the same. Both are sorted.

True

In [366]:
data == first  # False: data is a Python list, while first is a JSON string (compare unsorted == first to test key order)

False
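One practical use of sort_keys, sketched here on the assumption that you want a stable text form for comparing or caching data: two dicts with the same contents always serialize to the same string, whatever order their keys were created in.

```python
import json

# same contents, keys written in a different order
first_dict = {'b': [2, 4], 'a': 'A'}
second_dict = {'a': 'A', 'b': [2, 4]}

same_data = first_dict == second_dict

# with sort_keys=True the text form no longer depends on key order
canon_first = json.dumps(first_dict, sort_keys=True)
canon_second = json.dumps(second_dict, sort_keys=True)

print(canon_first)
print(canon_first == canon_second)
```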

For highly-nested data structures, you will want to specify a value for indent, so the output is formatted nicely as well.

In [370]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
print 'DATA UNSORTED:', repr(data)

DATA UNSORTED: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]


In [371]:
print 'NORMAL SORTED:', json.dumps(data, sort_keys=True)

NORMAL SORTED: [{"a": "A", "b": [2, 4], "c": 3.0}]


In [372]:
print 'INDENT SORTED:', json.dumps(data, sort_keys=True, indent=2)

INDENT SORTED: [
  {
    "a": "A", 
    "b": [
      2, 
      4
    ], 
    "c": 3.0
  }
]


In [373]:
print 'INDENT SORTED:', json.dumps(data, sort_keys=True, indent=5)

INDENT SORTED: [
     {
          "a": "A", 
          "b": [
               2, 
               4
          ], 
          "c": 3.0
     }
]


Verbose output like this increases the number of bytes needed to transmit the same amount of data, however, so it isn’t the sort of thing you necessarily want to use in a production environment. In fact, you may want to adjust the settings for separating data in the encoded output to make it even more compact than the default.

In [404]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0 } ]
print 'DATA:', repr(data)
print 'repr(data)             :', len(repr(data))
print 'dumps(data)            :', len(json.dumps(data))
print 'dumps(data, indent=2)  :', len(json.dumps(data, indent=2))
print 'dumps(data, separators):', len(json.dumps(data, separators=(',',':')))

DATA: [{'a': 'A', 'c': 3.0, 'b': (2, 4)}]
repr(data)             : 35
dumps(data)            : 35
dumps(data, indent=2)  : 76
dumps(data, separators): 29


The JSON format expects the keys to a dictionary to be strings. Python’s json module converts int, float, bool and None keys to strings for you, but any other key type, such as the tuple below, makes the encoder raise a TypeError. One way to work around that limitation is to skip over non-string keys using the skipkeys argument:

In [405]:
data = [ { 'a':'A', 'b':(2, 4), 'c':3.0, ('d',):'D tuple' } ]

In [406]:
data

[{'a': 'A', 'b': (2, 4), 'c': 3.0, ('d',): 'D tuple'}]

In [407]:
print 'First attempt'
try:
    print json.dumps(data)
except (TypeError, ValueError) as err:
    print 'ERROR:', err

print
print 'Second attempt'
print json.dumps(data, skipkeys=True)

First attempt
ERROR: keys must be a string

Second attempt
[{"a": "A", "c": 3.0, "b": [2, 4]}]


Rather than raising an exception, the non-string key is simply ignored.
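If dropping the entry is not acceptable, another workaround is to convert the keys to strings before encoding. This is a sketch only: stringify_keys is a hypothetical helper, not part of the json module, and it uses str() on each key, so the original key type cannot be recovered on decoding.

```python
import json


def stringify_keys(obj):
    """Recursively replace dict keys with their str() form (hypothetical helper)."""
    if isinstance(obj, dict):
        return dict((str(k), stringify_keys(v)) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(item) for item in obj]
    return obj


data = [{'a': 'A', ('d',): 'D tuple'}]

# the tuple key ('d',) becomes the string "('d',)" and is kept in the output
encoded = json.dumps(stringify_keys(data), sort_keys=True)
print(encoded)

decoded = json.loads(encoded)
print(sorted(decoded[0].keys()))
```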

In [408]:
# print json.dumps(data) # run this to see the error message

### Working with Your Own Types

All of the examples so far have used Python’s built-in types because those are supported by json natively. It isn’t uncommon, of course, to have your own types that you want to be able to encode as well. There are two ways to do that.

First, we’ll need a class to encode:

In [432]:
class MyObj(object):
    def __init__(self, s):
        self.s = s
    def __repr__(self):
        return '< Name is %s >' % self.s

The simple way of encoding a MyObj instance is to define a function to convert an unknown type to a known type. You don’t have to do the encoding yourself, just convert one object to another.

In [433]:
# import json_myobj

In [434]:
obj = MyObj('kim')

In [435]:
obj

< Name is kim >

In [436]:
obj.s

'kim'

In [437]:
print 'First attempt'
try:
    print json.dumps(obj)
except TypeError as err:
    print 'ERROR:', err

def convert_to_builtin_type(obj):
    print 'default(', repr(obj), ')'
    # Convert objects to a dictionary of their representation
    d = { '__class__':obj.__class__.__name__, 
          '__module__':obj.__module__,
          }
    d.update(obj.__dict__)
    return d

print
print 'With default'
print json.dumps(obj, default=convert_to_builtin_type)

First attempt
ERROR: < Name is kim > is not JSON serializable

With default
default( < Name is kim > )
{"s": "kim", "__module__": "__main__", "__class__": "MyObj"}


In [438]:
obj.__class__.__name__

'MyObj'

In [439]:
obj.__module__ # if this were imported from say, numpy it'd say numpy here.

'__main__'

In [454]:
obj.__dict__

{'s': 'kim'}
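The text mentions two ways to encode your own types, but only the default= function is shown in this section. A second approach, assumed here rather than taken from the notebook, is to subclass json.JSONEncoder and override its default() method; decoding can then use the object_hook argument of json.loads. MyObjEncoder and dict_to_obj are hypothetical names, and MyObj is redefined so that the sketch is self-contained.

```python
import json


class MyObj(object):
    def __init__(self, s):
        self.s = s


class MyObjEncoder(json.JSONEncoder):
    """Hypothetical encoder; default() is the method JSONEncoder calls for unknown types."""
    def default(self, obj):
        if isinstance(obj, MyObj):
            d = {'__class__': obj.__class__.__name__}
            d.update(obj.__dict__)
            return d
        # anything else: let the base class raise TypeError
        return json.JSONEncoder.default(self, obj)


def dict_to_obj(d):
    """object_hook: rebuild a MyObj from a dict tagged with __class__."""
    if d.get('__class__') == 'MyObj':
        return MyObj(d['s'])
    return d


encoded = json.dumps(MyObj('kim'), cls=MyObjEncoder, sort_keys=True)
print(encoded)

restored = json.loads(encoded, object_hook=dict_to_obj)
print(restored.s)
```

Keeping the conversion in an encoder class means it can be passed around with cls= wherever the data is dumped, instead of repeating default= at each call.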

In [445]:
import numpy as np
a = np.arange(0,6,1) # array from 0 to 5 with step = 1.

In [448]:
a

array([0, 1, 2, 3, 4, 5])

In [451]:
np.array.__module__

'numpy.core.multiarray'

In [452]:
np.arange.__module__

'numpy.core.multiarray'

In [453]:
np.matrix.__module__

'numpy.matrixlib.defmatrix'