# Data types and data structures

We illustrate the use of strings, lists, tuples, dictionaries and sets in this session.

## Strings

You can create a string, which is a sequence of characters, by enclosing the characters in quotes (single or double).

In [1]:
s1 = "Here is a string"

The built-in function len gives the number of items in a sequence. For a string, this is the number of characters.

In [2]:
len(s1)

16

Individual items in a sequence are accessed with square brackets. Numbering starts at 0.

In [3]:
s1[3]

'e'

One can take a slice of a sequence using two integers that give the start and end positions within the sequence. The start is included and the end is not. Negative numbers count positions from the end of the sequence.

In [4]:
s1[0:4]

'Here'

In [5]:
s1[4:-2]

' is a stri'
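The slicing rules above can be tried out in a few more forms. The following is a small sketch; the variable names are only illustrative, and the optional third number (the step) is an addition that the cells above do not use.

```python
text = "Here is a string"

# Omitting the start or the end means "from the beginning" or "to the end".
first_word = text[:4]
last_word = text[-6:]

# An optional third number sets the step; a step of -1 walks backwards.
every_other = text[::2]
reversed_text = text[::-1]

print(first_word)
print(last_word)
print(every_other)
print(reversed_text)
```

A slice never raises an error for positions beyond the end of the sequence; it simply stops at the last item.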

To see which methods are available on a string, list them with the dir function.

In [6]:
dir(s1)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


In [7]:
s1.lower()

'here is a string'

In [8]:
s1.upper()

'HERE IS A STRING'

In [9]:
s1.swapcase()

'hERE IS A STRING'
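String methods such as lower and upper return a new string and leave the original untouched. The sketch below shows this, along with a few other methods from the dir listing (split, join, find, replace); the variable names are illustrative.

```python
text = "Here is a string"

# Methods return new strings; the original is not modified.
shouted = text.upper()
print(text)     # still the original text
print(shouted)

# split breaks a string into a list of words; join glues a list back together.
words = text.split()
dashed = "-".join(words)

# find returns the position of the first match, or -1 if there is none.
position = text.find("is")
not_found = text.find("xyz")

# replace substitutes every occurrence of a substring.
replaced = text.replace("string", "sentence")
```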

The plus operator concatenates two sequences of the same type.

In [10]:
s2 = "...This is also a string"
s1 + s2

'Here is a string...This is also a string'

## Lists

One can create a list using square brackets. The items can be of any type, and different types can be mixed in the same list.

In [11]:
a1=[1,2,3,4,5]

In [12]:
len(a1)

5

In Python 2, the range function returned a list of integers starting from 0 and stopping before the number provided. In Python 3, range() instead returns a separate range object that produces its values on demand. Passing it to the list constructor gives the same list as in Python 2. If the first command below does not print a list of numbers from 0 to 9, use the second one.

In [13]:
range(10)

range(0, 10)

In [14]:
list(range(0,10,1))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
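A range object can be used directly for many things without converting it to a list first. The following sketch, which assumes Python 3, shows indexing, length, membership tests and a negative step; the variable names are illustrative.

```python
evens = range(0, 10, 2)

# A range supports len, indexing and membership tests without building a list.
count = len(evens)
second = evens[1]
has_six = 6 in evens
has_seven = 7 in evens

# Convert to a list only when an actual list is needed.
evens_list = list(evens)

# A negative step counts downwards; the stop value is still excluded.
countdown = list(range(10, 0, -3))

print(evens_list)
print(countdown)
```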

In [15]:
help(range)

Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |

In [16]:
a1=list(range(5,100,5))

In [17]:
a1

[5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]

In [18]:
r1=[1.1, 2.2, 3.3, 4.4, 5.5]

In [19]:
r1

[1.1, 2.2, 3.3, 4.4, 5.5]

In [20]:
c1=['a', 'b', 'c', 'd']

In [21]:
c1

['a', 'b', 'c', 'd']

Lists can be made of a mixture of different types of items.

In [22]:
m1=[1, 1.1, '1']

In [23]:
m1

[1, 1.1, '1']

The addition operator concatenates lists. This differs from languages such as Octave, where + adds arrays element by element.

In [24]:
j1 = r1 + c1

In [25]:
j1

[1.1, 2.2, 3.3, 4.4, 5.5, 'a', 'b', 'c', 'd']

In [26]:
type(j1)

list
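To make the contrast with element-wise addition concrete, here is a small sketch. It uses only built-in Python; the names x, y and elementwise are illustrative, and the list comprehension with zip is one possible way to add item by item, not the only one.

```python
r = [1.1, 2.2, 3.3]
c = ['a', 'b']

# + builds a new list; the two operands are left unchanged.
joined = r + c

# Adding two lists of numbers also concatenates them.
x = [1, 2, 3]
y = [10, 20, 30]
concatenated = x + y

# Element-wise addition has to be written out explicitly.
elementwise = [a + b for a, b in zip(x, y)]

print(joined)
print(concatenated)
print(elementwise)
```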

You can check whether a value is present in a list with the in operator. The result is a boolean.

In [27]:
'b' in j1

True

In [28]:
'x' in j1

False

You can remove an item from a list with the remove method of the list object. It deletes the first item equal to the given value.

In [29]:
j1.remove(4.4)

In [30]:
j1

[1.1, 2.2, 3.3, 5.5, 'a', 'b', 'c', 'd']

In [31]:
j1[3]

5.5

A ValueError is raised when you try to remove an item that does not exist in the list.

In [32]:
j1.remove(3.1415)

ValueError: list.remove(x): x not in list
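If a missing item should not stop the program, the removal can be guarded. The sketch below shows two common ways, a membership test and a try/except block; the list values and the variable name caught are illustrative.

```python
values = [1.1, 2.2, 3.3, 2.2]

# remove deletes only the first matching item.
values.remove(2.2)
print(values)

# Option 1: check for membership before removing.
if 3.1415 in values:
    values.remove(3.1415)

# Option 2: attempt the removal and handle the error.
caught = False
try:
    values.remove(3.1415)
except ValueError:
    caught = True

print(caught)
```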

The built-in functions min and max return the smallest and largest values in a list.

In [33]:
min(r1)

1.1

In [34]:
max(r1)

5.5
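min and max work on any list whose items can be compared with each other. The sketch below, assuming Python 3, shows the optional key argument and what happens with a list that mixes numbers and strings; the example lists are illustrative.

```python
r = [1.1, 2.2, 3.3, 4.4, 5.5]
smallest, largest = min(r), max(r)

# The key argument chooses what to compare; here, the length of each word.
words = ['pear', 'fig', 'banana']
shortest = min(words, key=len)
longest = max(words, key=len)

# In Python 3, numbers and strings cannot be ordered against each other.
mixed = [1.1, 'a']
comparable = True
try:
    max(mixed)
except TypeError:
    comparable = False

print(smallest, largest, shortest, longest, comparable)
```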

Slicing a list works the same way as slicing a string.

In [35]:
j1[1:-2]

[2.2, 3.3, 5.5, 'a', 'b']

## Tuples

Tuples are created using parentheses. A tuple is immutable: its items cannot be reassigned, added or removed after creation. Otherwise a tuple is very much like a list.

In [36]:
t1 = (1.0, 2.0, 'a', 'b')

In [37]:
t1

(1.0, 2.0, 'a', 'b')

In [38]:
type(t1)

tuple

You can modify the value of an item in a list, but not of an item in a tuple.

In [39]:
j1[1]=2.3

In [40]:
j1

[1.1, 2.3, 3.3, 5.5, 'a', 'b', 'c', 'd']

If you try to change an item of a tuple, you will get an error as shown below.

In [41]:
t1[1]=2.2

TypeError: 'tuple' object does not support item assignment
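Since a tuple cannot be changed in place, a modified version has to be built as a new tuple. The sketch below shows this, together with unpacking and the case of a list stored inside a tuple; all names are illustrative.

```python
t = (1.0, 2.0, 'a', 'b')

# Item assignment fails, so catch the error to keep the script running.
assignment_failed = False
try:
    t[1] = 2.2
except TypeError:
    assignment_failed = True

# Build a new tuple from slices of the old one instead.
t_new = t[:1] + (2.2,) + t[2:]

# Tuples can be unpacked into separate names.
first, second, *rest = t

# The tuple itself is fixed, but a mutable object inside it can still change.
nested = (1, [2, 3])
nested[1].append(4)

print(t_new)
print(nested)
```

Note the trailing comma in `(2.2,)`: it is what makes a one-item tuple, since `(2.2)` is just a number in parentheses.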

## Dictionaries

Dictionaries are associative arrays, also called hashes or maps. They hold a collection of key-value pairs that are looked up by key rather than by position (since Python 3.7, iteration follows insertion order). One can create a dictionary using curly braces, with a colon separating each key from its value. You can also pass the pairs as keyword arguments to the dict constructor.

In [42]:
d1 = {'a':1.0, 'b':2.0}

In [43]:
type(d1)

dict

In [44]:
d2 = dict(c=3.0, d=4.0)

In [45]:
d2

{'c': 3.0, 'd': 4.0}

In [46]:
type(d2)

dict
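Besides the two forms shown above, the dict constructor also accepts a sequence of key-value pairs. The sketch below builds the same dictionary in four ways; the variable names are illustrative.

```python
# Literal syntax with curly braces and colons.
d_literal = {'a': 1.0, 'b': 2.0}

# Keyword arguments; the keys must be valid Python identifiers.
d_kwargs = dict(a=1.0, b=2.0)

# A list of (key, value) tuples.
d_pairs = dict([('a', 1.0), ('b', 2.0)])

# Two parallel lists combined with zip.
d_zip = dict(zip(['a', 'b'], [1.0, 2.0]))

print(d_literal == d_kwargs == d_pairs == d_zip)
```

The keyword form is convenient but limited to string keys that look like variable names; the literal and pair forms accept any hashable key.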

In [47]:
d1.values()

dict_values([1.0, 2.0])

In [48]:
d1.keys()

dict_keys(['a', 'b'])

In [49]:
type(d1.keys())

dict_keys

In [50]:
type(d1.values())

dict_values

New items are added to the dictionary by assigning a value to a new key.

In [51]:
d1['pi'] = 3.1415

In [52]:
d1

{'a': 1.0, 'b': 2.0, 'pi': 3.1415}

One can ask whether a particular key is in the dictionary with the in operator, just as for lists. Python 2 also offered the has_key() method for this; it was deprecated and then removed in Python 3, so use in instead.

In [53]:
'a' in d1

True

In [54]:
d1['a']

1.0
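Looking up a key that is not present raises a KeyError, so it helps to know the alternatives. The following sketch shows the in operator and the get method, which appears in the dir listing of a dictionary; the keys and default values are illustrative.

```python
d = {'a': 1.0, 'b': 2.0}

# in tests the keys, not the values.
has_key_a = 'a' in d
has_value_as_key = 1.0 in d

# get returns None, or a chosen default, instead of raising an error.
missing = d.get('z')
fallback = d.get('z', 0.0)
present = d.get('a', 0.0)

# Square brackets raise KeyError for a missing key.
key_error = False
try:
    d['z']
except KeyError:
    key_error = True

print(has_key_a, has_value_as_key, missing, fallback, present, key_error)
```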

A dictionary can also hold lists as values.

In [55]:
d1['mylist']=[1.0, 2.0]

In [56]:
d1

{'a': 1.0, 'b': 2.0, 'pi': 3.1415, 'mylist': [1.0, 2.0]}

The lists stored in a dictionary need not have the same length. We wrap the range function in the list constructor so that a list, rather than a range object, is stored in Python 3.

In [57]:
d1['mylonglist'] = list(range(5,25))

In [58]:
d1

{'a': 1.0,
 'b': 2.0,
 'pi': 3.1415,
 'mylist': [1.0, 2.0],
 'mylonglist': [5,
  6,
  7,
  8,
  9,
  10,
  11,
  12,
  13,
  14,
  15,
  16,
  17,
  18,
  19,
  20,
  21,
  22,
  23,
  24]}

In [59]:
dir(d1)

['__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

The loop variable used to iterate over the keys can have any name. The keys() method of a dictionary returns a list in Python 2 and a view object in Python 3; a for loop can iterate over either one, so the following works in both versions.

In [60]:
for chabi in d1.keys():
    print(chabi, "=>", d1[chabi])

a => 1.0
b => 2.0
pi => 3.1415
mylist => [1.0, 2.0]
mylonglist => [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]


Wrapping the output of keys() in list makes an explicit copy of the keys. This is not required for plain iteration, but it is needed in Python 3 if the loop adds or removes keys from the dictionary.

In [61]:
for chabi in list(d1.keys()):
    print(chabi, "=>", d1[chabi])

a => 1.0
b => 2.0
pi => 3.1415
mylist => [1.0, 2.0]
mylonglist => [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
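The items method, listed in the dir output above, yields each key together with its value, which avoids the extra lookup inside the loop. The sketch below also shows a case where a list copy of the keys matters in Python 3: deleting entries while looping. The dictionary contents and the threshold are illustrative.

```python
d = {'a': 1.0, 'b': 2.0, 'pi': 3.1415}

# items() gives (key, value) pairs that can be unpacked in the for statement.
lines = []
for key, value in d.items():
    lines.append(f"{key} => {value}")
print("\n".join(lines))

# Deleting while iterating directly over d raises a RuntimeError in Python 3,
# so loop over a list copy of the keys instead.
for key in list(d):
    if d[key] > 3:
        del d[key]

print(d)
```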


## Sets

A set is a mutable, unordered collection of unique items without an index. The items themselves must be hashable.

In [62]:
set1 = {'a', 3.14, True, "set-of-stuff"}

In [63]:
len(set1)

4

In [64]:
type(set1)

set

The class 'set' does not support indexing. Try and see.

In [65]:
set1[0]

TypeError: 'set' object is not subscriptable

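One can add elements to a set with update. In the example below, 1 does not appear as a separate element because 1 == True in Python and True is already in the set.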

In [66]:
set1.update([1, 2, 'hello'])

In [67]:
set1

{2, 3.14, True, 'a', 'hello', 'set-of-stuff'}
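The output above has six elements even though three values were passed to update. A short sketch of why: a set keeps only one of any two items that compare equal, and in Python 1 == True. The variable names are illustrative.

```python
items = {'a', 3.14, True}

# add inserts a single element; adding an existing element changes nothing.
items.add('a')
size_after_add = len(items)

# update takes any iterable. 1 equals True, so only 2 is actually new.
items.update([1, 2])
size_after_update = len(items)

# Membership tests follow the same equality rule.
one_is_member = 1 in items

# Building a set from a list is a quick way to drop duplicates.
unique = set([3, 1, 3, 2, 1])

print(size_after_add, size_after_update, one_is_member, unique)
```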

You can delete an item using discard or remove. discard is the safer choice when the element may be absent: remove raises a KeyError for a missing element, while discard does nothing.

In [68]:
set1.discard('hello')

In [69]:
set1

{2, 3.14, True, 'a', 'set-of-stuff'}
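The difference between the two methods shows up only when the element is missing. A minimal sketch, with illustrative names:

```python
letters = {'a', 'b'}

# discard is silent when the element is absent.
letters.discard('x')

# remove raises KeyError when the element is absent.
remove_failed = False
try:
    letters.remove('x')
except KeyError:
    remove_failed = True

# Both behave the same when the element is present.
letters.discard('a')
letters.remove('b')

print(remove_failed, letters)
```

Using remove can still be the right choice when a missing element would indicate a bug that should not pass unnoticed.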

In [70]:
set2 = {"set2.e1", "set2.e2"}

Set operations such as union, intersection and difference are also available.

In [71]:
set1.union(set2)

{2, 3.14, True, 'a', 'set-of-stuff', 'set2.e1', 'set2.e2'}
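The other set operations follow the same pattern as union, and each has an operator form as well. The sketch below uses two small sets of integers chosen for illustration.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

# Each method returns a new set and leaves a and b unchanged.
union = a.union(b)                  # same as a | b
common = a.intersection(b)          # same as a & b
only_in_a = a.difference(b)         # same as a - b
in_one_only = a.symmetric_difference(b)  # same as a ^ b

# Subset tests are available too.
is_subset = {3, 4}.issubset(a)

print(union, common, only_in_a, in_one_only, is_subset)
```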