<a href="https://colab.research.google.com/github/tomersk/learn-python/blob/main/02_02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 2.2 Data structures
Data structures can hold more than one item of data. There are four built-in data structures in Python:

*   list,
*   tuple,
*   dictionary, and
*   set.

Apart from these built-in data structures, libraries can define their own data types, such as the very useful array type provided by NumPy (created with numpy.array).

### 2.2.1 List
A list is a sequence of items (values). The items can be of any data type, and items of different data types can be stored in the same list.

In [1]:
a = ['Ram', 'Sita', 'Bangalore', 'Delhi']
print(a)

b = [25, 256, 2656, 0]
print(b)

c = [25, 'Bangalore']
print(c)

['Ram', 'Sita', 'Bangalore', 'Delhi']
[25, 256, 2656, 0]
[25, 'Bangalore']


The variables *a* and *b* hold items of the same data type, while *c* holds items of different data types. The items in a list are accessed using their indices. In Python, indices start at 0. So, to get the first and third items, the indices should be 0 and 2.

In [2]:
print(a[0])
print(a[2])

Ram
Bangalore


Negative indices are also allowed in Python. The last item in the list has the index -1, the second-last item has the index -2, and so on.

In [3]:
print(a[-1])

Delhi


Likewise, the second-last item in the list can be accessed using the index -2.
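The indexing rules above can be checked with a short sketch. It reuses the list *a* from the earlier cell; the replacement value 'Lakshman' and the variable names are only illustrative.

```python
a = ['Ram', 'Sita', 'Bangalore', 'Delhi']
n = len(a)                       # number of items in the list

# A negative index -k refers to the same item as the index n - k.
same_last = a[-1] == a[n - 1]
second_last = a[-2]
print(same_last, second_last)    # True Bangalore

# Lists are mutable, so an item can be replaced through its index.
a[0] = 'Lakshman'                # illustrative value
print(a[0])                      # Lakshman

# An index beyond the end of the list raises an IndexError.
try:
    a[n]
except IndexError as err:
    error_message = str(err)
print(error_message)             # list index out of range
```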

### 2.2.2 Dictionary
In a list, the indices can only be integers. A dictionary can use any immutable (hashable) data type, such as a string or a tuple, as its index (called a key). This makes a dictionary very suitable when the indices are names. For example, in hydrology, variables are often recorded for each field station, and the stations are identified by name. Let us try to retrieve the value of a variable by using lists first, and then by using a dictionary. We can use one list to store the names of the stations, and another for the variable. First, we need to find the index of the station, and then use this index to access the value from the list of variables.

In [8]:
names = ['Delhi', 'Bangalore', 'Kolkata']
rainfall = [0, 5, 10]
idx = names.index('Bangalore') # find the index of the station
print(rainfall[idx])

5


Now, let us try this using a dictionary.

In [9]:
rainfall = {'Delhi':0, 'Bangalore':5, 'Kolkata':10}
print(rainfall['Bangalore'])

5


The list version could also have been written in one line, but a dictionary provides a neater and cleaner way to do this.
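As a minimal sketch of how the two approaches relate, the two lists can be combined into a dictionary with the built-in functions zip and dict. The station 'Chennai' and the value assigned to 'Mumbai' are made up for illustration.

```python
names = ['Delhi', 'Bangalore', 'Kolkata']
values = [0, 5, 10]

# zip pairs each name with its value; dict turns the pairs into a dictionary.
rainfall = dict(zip(names, values))
print(rainfall['Bangalore'])          # 5

# The in operator checks whether a key exists before it is accessed.
has_chennai = 'Chennai' in rainfall
print(has_chennai)                    # False

# get returns a default value instead of raising KeyError for a missing key.
missing = rainfall.get('Chennai', -1)
print(missing)                        # -1

# A new station is added with a single assignment (illustrative value).
rainfall['Mumbai'] = 20
print(len(rainfall))                  # 4
```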

### 2.2.3 Tuple
A tuple is a sequence of values, similar to a list, except that tuples are immutable (their items cannot be modified).

In [12]:
foo = 5,15,18
print(foo[2])

18


Let us try modifying an item:

In [13]:
foo[1] = 10

TypeError: 'tuple' object does not support item assignment

When we try to modify an item in the tuple, Python raises an error. Tuples are useful when there is a need to specify some constants, and to make sure that these constants do not change. The immutability of tuples ensures that the values of the constants will not change during the execution of the program.
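The behaviour described above can be sketched as follows. The name `constants` and its values are hypothetical; the error is caught with try/except so that the remaining lines still run.

```python
constants = (9.81, 1000.0)       # hypothetical constants stored in a tuple

# Reading items works exactly as it does for a list.
print(constants[0])              # 9.81

# Assigning to an item raises a TypeError, so the tuple stays unchanged.
try:
    constants[0] = 10
except TypeError as err:
    error_message = str(err)
print(error_message)             # 'tuple' object does not support item assignment
print(constants)                 # (9.81, 1000.0)

# If modified values are needed, a new list can be built from the tuple.
editable = list(constants)
editable[0] = 10
print(editable)                  # [10, 1000.0]
```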

A tuple with only one item is defined by placing a comma (,) after the item, e.g.:

In [14]:
foo = 5
type(foo)

int

In [15]:
foo = 5,
type(foo)

tuple

You might have noticed that without the comma (,), Python does not treat it as a tuple.
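The same rule applies when parentheses are used: the comma, not the parentheses, makes the tuple. A short sketch with illustrative variable names:

```python
a = (5)      # parentheses alone only group the value: this is an int
b = (5,)     # the trailing comma makes a one-item tuple
c = ()       # empty parentheses make an empty tuple

print(type(a).__name__, type(b).__name__, type(c).__name__)   # int tuple tuple
print(len(b), len(c))                                          # 1 0
```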

### 2.2.4 Numpy.array
Numerical Python (NumPy) is a library/package written mainly in the C programming language, with an application programming interface (API) provided for Python. The library provides the numpy.array data type, which is very useful for performing mathematical operations on arrays. It is the type of data that we will be dealing with most of the time. This library is not a part of the standard Python distribution, so NumPy has to be installed on the system before it can be used. We can check whether NumPy is installed on our system by using the following command:

In [None]:
$ python -c 'import numpy'

If you are already inside a Python session (e.g. in Google Colab), you can try importing the library directly:

In [16]:
import numpy

If either of these commands gives no output (no error), it means that NumPy is installed.

If a library is not installed on the system, you will see an error message. Let us try to import the *ambhas* library:

In [17]:
import ambhas

ModuleNotFoundError: No module named 'ambhas'

This means that *ambhas* is not installed on the system. You can install *ambhas* by following the steps provided in section 1.3.

The command python -c 'import numpy' is a way to run some simple code without opening the interactive Python interpreter. This is useful when you want to do something small, quickly, such as checking whether a package is installed on your system.

Before using any library, it should be imported into the program. The import statement is used to import a library. There are three ways to import a complete library or some functions from it. The first way is to import the complete library.

In [None]:
import numpy
x = [1, 2, 5, 9.0, 15] # list containing only numbers (floats or integers)
print(type(x))

x = numpy.array(x) # convert the list into a numpy array
print(type(x))

We imported the complete library numpy, and after doing so, whenever we need any function (e.g. array) from this library, we need to prefix it with the name of the library (i.e. numpy.array). The array function converts a list of integers and/or floats into a numpy array. Library names are often quite long, and they can be abbreviated using as in the following manner.


In [19]:
import numpy as np
x = [1, 2, 5, 9.0, 15]
x = np.array(x) # convert the list into numpy array
type(x)

numpy.ndarray

If only a few functions are needed, they can be imported by explicitly giving their names.

In [20]:
from numpy import array
x = array(x) # convert the list into numpy array
type(x)

numpy.ndarray

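If all the functions are needed, and you do not want to write numpy or np before them, you can import them in the following way. Be aware that this form can silently replace names that already exist in your program; for example, numpy's sum replaces Python's built-in sum.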


In [21]:
from numpy import *
x = array(x) # convert the list into numpy array
type(x)

numpy.ndarray
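These import styles are not specific to NumPy. The sketch below repeats the first three with the standard-library math module, chosen here only because it is always available; the variable names are illustrative.

```python
# 1. Import the complete library and use its full name as a prefix.
import math
r1 = math.sqrt(16)

# 2. Import the complete library under a shorter name.
import math as m
r2 = m.sqrt(16)

# 3. Import only the functions that are needed, by name.
from math import sqrt
r3 = sqrt(16)

print(r1, r2, r3)        # 4.0 4.0 4.0

# All three names refer to the very same function object.
print(math.sqrt is m.sqrt is sqrt)   # True
```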

Anything written after # is a comment, and Python does not execute it. Comments are useful for making your code more readable. A comment can also occupy a full line.

A numpy array is a homogeneous multidimensional array: all of its items have the same data type, for example only integers, only floats, only complex numbers or only strings. If a combination of integers and floats is given to numpy.array, the integers are converted to floats. If a combination of numbers and strings is given, the numbers are converted to strings. The data type of a numpy.ndarray can be checked using its attribute dtype.

In [31]:
import numpy as np
a = np.array([1,5,9.0,15]) # np.array can be defined directly also
print(a.dtype)

b = np.array([1,5,9,15]) # this is holding only integers
print(b.dtype)

c = np.array(['Delhi', 'Paris']) # this is holding strings
print(c.dtype)

d = np.array([5,'b'])
print(d.dtype)
print(d)

float64
int64
<U5
<U21
['5' 'b']
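One reason a homogeneous array is useful for mathematical operations can be sketched by comparing it with a list. The example assumes NumPy is installed; the variable names are illustrative.

```python
import numpy as np

values = [1, 5, 9.0, 15]

# Multiplying a list by 2 repeats the list.
repeated = values * 2
print(repeated)                  # [1, 5, 9.0, 15, 1, 5, 9.0, 15]

# Multiplying an array by 2 multiplies every item.
arr = np.array(values)
doubled = (arr * 2).tolist()     # tolist converts the result back to a list
print(doubled)                   # [2.0, 10.0, 18.0, 30.0]

# The single float in the list made the whole array float.
print(arr.dtype)                 # float64
```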


The sum of the array can be computed using the method sum, in the following manner.


In [22]:
import numpy as np
x = np.array([1,5,9.0,15])
x.sum()


30.0

Did you notice the difference between accessing attributes and calling methods? Methods perform some action on the object, and the action often needs some input, so methods are called with brackets (). If the method needs some input, it is given inside the brackets; if there is no input, empty brackets are used. Try using a method (e.g. sum) without the brackets: you will see only some details about the method, and not its result.

In [32]:
print(x.sum)

<built-in method sum of numpy.ndarray object at 0x7f013db08f90>


Python is an object-oriented programming (OOP) language, and attributes and methods are used quite commonly, so it is better to know briefly about them before going further. Attributes represent properties of an object, and can be of any type, even the type of the object that contains them. Methods represent what an object can do. An attribute can only hold a value or a state, while a method can do something or perform an action.
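The distinction can be made concrete with a small class. This is only a sketch: the class `Station`, its attributes and its method `total` are hypothetical and are not part of any library.

```python
class Station:
    """Hypothetical class used only to illustrate attributes and methods."""

    def __init__(self, name, rainfall):
        self.name = name            # attribute: holds a value
        self.rainfall = rainfall    # attribute: holds a list of values

    def total(self):
        """Method: performs an action (adds up the rainfall values)."""
        return sum(self.rainfall)


station = Station('Bangalore', [0, 5, 10])

# Attributes are accessed without brackets.
print(station.name)              # Bangalore

# Methods are called with brackets, even when there is no input.
print(station.total())           # 15

# Without the brackets the method is not executed; we only get the method itself.
print(callable(station.total))   # True
```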