# Python Data Structures Tutorial

#### *Get introduced to Python data structures: learn more about data types and primitive as well as non-primitive data structures, such as strings, lists, tuples, dictionaries, etc.*

Data structures are a way of organizing and storing data so that it can be accessed and worked with efficiently. They define the relationships between the data and the operations that can be performed on it. Python defines many kinds of data structures, which make it easier for data scientists and computer engineers to concentrate on solving larger problems rather than getting lost in the details of data description and access.

In this tutorial, you'll learn about the various Python data structures and see how they are implemented. The tutorial is split into two notebooks:

- Notebook1 : Primitive-Data-Structures (this notebook)
- Notebook2 : Non-Primitive-Data-Structures  [here](http://localhost:8888/notebooks/python-language-notebooks/data-structures.ipynb)

### Abstract Data Type and Data Structures
As you read in the introduction, data structures help you to focus on the bigger picture rather than getting lost in the details. This is known as data abstraction.

Now, data structures are actually an implementation of Abstract Data Types or ADT. This implementation requires a physical view of data using some collection of programming constructs and basic data types.

Generally, data structures can be divided into two categories in computer science: primitive and non-primitive data structures. The former are the simplest forms of representing data, whereas the latter are more advanced: they contain the primitive data structures within more complex data structures for special purposes.

In this notebook, we will discuss the primitive data structures implemented in Python, namely **strings, integers, floats and booleans**.

## Strings

A string is a sequence of characters, for example a word or a sentence.

You will be using strings in almost every Python program that you write, so pay attention to the following part.

#### Single quotes

You can specify strings using single quotes such as **'Quote me on this'**.

All whitespace within the quotes, i.e. spaces and tabs, is preserved as-is.

#### Double quotes

Strings in double quotes work exactly the same way as strings in single quotes. An example is **"What's your name?"**.

#### Triple quotes

You can specify multi-line strings using triple quotes - (""" or '''). You can use single quotes and double quotes freely within the triple quotes. An example is:

In [30]:
'''This is a multi-line string. This is the first line.
This is the second line.
"What's your name?," I asked.
He said "Bond, James Bond."
'''

'This is a multi-line string. This is the first line.\nThis is the second line.\n"What\'s your name?," I asked.\nHe said "Bond, James Bond."\n'

### Strings Are Immutable

This means that once you have created a string, you cannot change it. Although this might seem like a bad thing, it really isn't. We will see why this is not a limitation in the various programs that we see later on.
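
To make immutability concrete, here is a minimal sketch (the variable names are illustrative, not from the original notebook). It shows that assigning to a character position fails, and that the usual workaround is to build a new string instead:

```python
greeting = 'hello'

# Trying to change a character in place raises a TypeError.
try:
    greeting[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(assignment_failed)   # True

# Instead, build a new string from pieces of the old one.
new_greeting = 'J' + greeting[1:]
print(new_greeting)        # Jello
print(greeting)            # hello (the original is unchanged)
```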

#### *Note for C/C++ Programmers*

There is no separate **char** data type in Python. There is no real need for it and I am sure you won't miss it.

#### *Note for Perl/PHP Programmers*

Remember that single-quoted strings and double-quoted strings are the same - they do not differ in any way.
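
The two notes above can be checked with a short sketch (variable names are illustrative): the same text in single and double quotes compares equal, and a single character is simply a string of length 1.

```python
single = 'Quote me on this'
double = "Quote me on this"
print(single == double)    # True

# There is no char type: one character is a str of length 1.
letter = 'a'
print(type(letter))        # <class 'str'>
print(len(letter))         # 1
```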

### The format method

Sometimes we may want to construct strings from other information. This is where the **format( )** method is useful.

In [31]:
age = 58
name = 'Setty'

print('{0} was {1} years old when he started ML and DL practice'.format(name, age))


print('Why is {0} learning and teaching ML and DL at this age?'.format(name))


Setty was 58 years old when he started ML and DL practice
Why is Setty learning and teaching ML and DL at this age?


##### How It Works

A string can contain certain specifications; the format method then substitutes each specification with the corresponding argument passed to it.

Observe the first usage where we use **{0}** and this corresponds to the variable name which is the first argument to the format method. Similarly, the second specification is **{1}** corresponding to age which is the second argument to the format method. Note that Python starts counting from 0 which means that first position is at index 0, second position is at index 1, and so on.

Notice that we could have achieved the same using **string concatenation:**

In [32]:
name + ' is ' + str(age) + ' years old'

'Setty is 58 years old'

but that is much uglier and more error-prone. Second, **the format method converts the numeric variable age to a string automatically, whereas concatenation needs the explicit str( ) conversion shown here.** Third, when using the format method, we can change the message without having to deal with the variables used, and vice versa.

Also note that the numbers are optional, so you could have also written as:

In [33]:
age = 58
name = 'Setty'

print('{} was {} years old when he started ML and DL practice'.format(name, age))


print('Why is {} learning and teaching ML and DL at this age?'.format(name))


Setty was 58 years old when he started ML and DL practice
Why is Setty learning and teaching ML and DL at this age?


which will give the same exact output as the previous program.

We can also name the parameters:

In [34]:
age = 58
name = 'Setty'

print('{my_name} was {my_age} years old when he started ML and DL practice'.format(my_name=name, my_age=age))


print('Why is {my_name} learning and teaching ML and DL at this age?'.format(my_name=name))


Setty was 58 years old when he started ML and DL practice
Why is Setty learning and teaching ML and DL at this age?


which will give the same exact output as the previous program.


Python 3.6 introduced a shorter way to do named parameters, called "f-strings":

In [35]:
age = 58
name = 'Setty'

print(f'{name} was {age} years old when he started ML and DL practice')

# notice 'f' before the string

print(f'Why is {name} learning and teaching ML and DL at this age?')

Setty was 58 years old when he started ML and DL practice
Why is Setty learning and teaching ML and DL at this age?


What Python does in the **format method** is that it substitutes each argument value into the place of the specification. There can be more detailed specifications such as:

In [36]:
# decimal (.) precision of 3 for float '0.333'

print('{0:.3f}'.format(1.0/3))


# keyword-based 'Setty train ML and DL'

print('{name} train {subject}'.format(name='Setty', subject='ML and DL'))

0.333
Setty train ML and DL
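
As a further illustration of detailed specifications, the sketch below uses a few more standard format options (fill character, alignment and width). The sample values are made up for the example:

```python
# Centre the text in a field of width 11, padding with underscores.
padded = '{0:_^11}'.format('hello')
print(padded)            # ___hello___

# Right-align a number in a field of width 6 (padded with spaces).
right = '{0:>6}'.format(42)
print(repr(right))       # '    42'

# The same specifications work inside f-strings.
third = 1.0 / 3
rounded = f'{third:.3f}'
print(rounded)           # 0.333
```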


Since we are discussing formatting, note that print always ends with an invisible "new line" character **(\n)** so that repeated calls to print each print on a separate line. To prevent this newline character from being printed, you can specify that the output should **end** with an empty string:

In [37]:
print('a', end='')
print('b', end='')

ab

Or you can end with a space:

In [38]:
print('a', end=' ')
print('b', end=' ')
print('c')

a b c


You can also apply the + operations on two or more strings to **concatenate** them, just like in the example below:

In [39]:
x = 'Cake'
y = 'Cookie'
x + ' & ' + y

'Cake & Cookie'

Here are some other basic operations that you can perform with strings. For example, you can use **\*** to repeat a string a certain number of times:

In [40]:
# repeat string multiple times
name = "Modi"
print(name*10)

ModiModiModiModiModiModiModiModiModiModi


***You can also slice strings, which means that you can select parts of strings:***

In [41]:
# Range Slicing
z1 = x[2:] 

print(z1)

# Indexing individual characters
z2 = y[0] + y[1]

print(z2)

ke
Co
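
Slicing follows the general pattern `[start:stop:step]`, where the stop index is excluded. The sketch below, using an illustrative variable `word`, walks through the most common forms:

```python
word = 'Cookie'

print(word[0])      # C      (first character)
print(word[-1])     # e      (last character)
print(word[1:4])    # ook    (indices 1, 2 and 3; the stop index is excluded)
print(word[:2])     # Co     (from the start up to, not including, index 2)
print(word[2:])     # okie   (from index 2 to the end)
print(word[::2])    # Coi    (every second character)
print(word[::-1])   # eikooC (the string reversed)
```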


***Note that strings can also contain digits, but the + operation still concatenates such strings rather than adding them as numbers.***

In [42]:
x = "4"
y = "6"

print(x+y)

46


Python has many built-in methods, or helper functions, to manipulate strings.
Note that the fact that strings have **built-in methods** shows that in Python strings are also **objects**, belonging to the **str** class. In fact, in Python everything is an object, including the primitive data types: integers, floats, strings and booleans.

Replacing a substring, capitalising certain words in a paragraph, finding the position of a string within another string are some common string manipulations. Check out some of these:

- **Capitalize strings**

In [2]:
str.upper('ramesh')

'RAMESH'

In [43]:
str.capitalize('cookie')

'Cookie'

- **Retrieve the length of a string in characters. Note that the spaces also count towards the final result:**

In [44]:
str1 = "Cake 4 U"
str2 = "404"
len(str1)

8

- **Check whether a string consists of only digits**

In [45]:
str1.isdigit()

False

In [46]:
str2.isdigit()

True

- **Replace parts of strings with other strings**

In [47]:
str1.replace("4 U",str2)       #str1 is "Cake 4 U" and str2 is "404"

'Cake 404'

- **Find substrings in other strings; Returns the lowest index or position within the string at which the substring is found:**

In [48]:
str1 = "cookie"
str2 = "cook"
str1.find(str2)

0

The substring 'cook' is found at the start of 'cookie'. As a result, you refer to the position within 'cookie' at which you find that substring. In this case, 0 is returned because you start counting positions from 0!

In [49]:
str1 = "I bought some cookies Today"
str1.find(str2)

14

Similarly, the substring 'cook' is found at position 14 within 'I bought some cookies Today'. Remember that you start counting from 0 and that spaces count towards the positions!
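
It is also worth knowing what happens when the substring is absent. A minimal sketch, with an illustrative variable `sentence`: `find()` returns -1 in that case, and the `in` operator gives a plain True/False answer.

```python
sentence = "I bought some cookies Today"

print(sentence.find("cook"))    # 14
print(sentence.find("cake"))    # -1 (not found)

# The `in` operator only tells you whether the substring occurs.
print("cook" in sentence)       # True
print("cake" in sentence)       # False

# find() is case-sensitive.
print(sentence.find("today"))   # -1
print(sentence.find("Today"))   # 22
```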

You can find an exhaustive list of string methods in Python [here.](https://docs.python.org/3/library/stdtypes.html#string-methods)

## Boolean

This built-in data type can take the values **True** and **False**, which are often interchangeable with the integers 1 and 0. Booleans are useful in conditional and comparison expressions, as in the following examples:

In [50]:
x = 4
y = 2
x == y

False

In [51]:
x>y

True

In [52]:
x = 4
y = 2
z = (x==y)    # Comparison expression (evaluates to False)
if z:         # Conditional on the truth value of 'z'
    print("Cookie")
else:
    print("No Cookie")

No Cookie
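
The relationship between booleans and the integers 1 and 0 can be seen directly. In this sketch, the `scores` list is hypothetical sample data used only for illustration:

```python
print(True == 1)               # True
print(False == 0)              # True
print(isinstance(True, int))   # True (bool is a subclass of int)
print(True + True)             # 2

# Because True counts as 1, summing booleans counts matches.
scores = [4, 7, 1, 9]          # hypothetical sample data
passed = sum(score > 3 for score in scores)
print(passed)                  # 3
```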


### Data Type Conversion

Sometimes, you will find yourself working on someone else's code and you'll need to convert an integer to a float or vice versa, for example. Or maybe you find out that you have been using an integer when what you really need is a float. In such cases, you can convert the data type of variables!

To check the type of an object in Python, use the built-in **type( )**  function, just like in the lines of code below:

In [53]:
i = 4
type(i)

int

##### Implicit Data Type Conversion
This is an automatic data conversion that the Python interpreter handles for you. Take a look at the following example:

In [54]:
# A float
x = 4.0 

# An integer
y = 2 

# Divide `x` by `y`
z = x/y

# Check the type of `z`
type(z)

float

In the example above, you did not have to explicitly change the data type of y to perform float value division. The interpreter did this for you implicitly.

That's easy!
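
A few more cases of implicit conversion, as a minimal sketch with illustrative variable names. Mixing an int with a float yields a float, true division always yields a float, and booleans behave as integers in arithmetic:

```python
a = 3        # int
b = 1.5      # float
total = a + b
print(total, type(total))   # 4.5 <class 'float'>

# True division of two integers also produces a float.
q = 4 / 2
print(q, type(q))           # 2.0 <class 'float'>

# Booleans act as the integers 1 and 0 in arithmetic.
n = True + 2
print(n, type(n))           # 3 <class 'int'>
```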

##### Explicit Data Type Conversion
This type of data type conversion is user-defined, which means you have to explicitly tell the interpreter to change the data type of certain entities. Consider the code chunk below:

In [55]:
x = 2
y = "Bahubali: part"
favourite_movie = y+x
print(favourite_movie)

TypeError: must be str, not int

The above example gave you an error because Python does not implicitly convert between strings and integers, so it cannot tell whether you want concatenation or addition. You have a string and an integer that you're trying to add together.

There's an obvious mismatch.

To solve this, you'll first need to convert the int to a string to then be able to perform concatenation.

Note that it might not always be possible to convert a data type to another. Some built-in data conversion functions that you can use here are: **int( ), float( ), and str( ).**

In [None]:
x = 2
y = "Bahubali: part"
favourite_movie = y+str(x)
print(favourite_movie)
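
The conversion functions mentioned above can be sketched as follows. The sample strings are made up for illustration, and the last case shows a conversion that is not possible and raises a ValueError:

```python
print(int("42") + 1)       # 43
print(float("3.14"))       # 3.14
print(int(3.99))           # 3 (truncates toward zero, does not round)
print(str(2.5) + " kg")    # 2.5 kg

# Not every value can be converted: this string is not a number.
try:
    int("forty-two")
    converted = True
except ValueError:
    converted = False
print(converted)           # False
```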

### Integers
You can use an integer to represent numeric data, and more specifically, whole numbers of any size, positive or negative, like 4, 5, or -1.
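
Python integers have no fixed upper limit; they grow as needed, limited only by available memory. A minimal sketch, with an illustrative variable `big`:

```python
big = 2 ** 100
print(big)          # 1267650600228229401496703205376
print(type(big))    # <class 'int'>

# Arithmetic on large integers stays exact.
print(big + 1 - big)   # 1
```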

### Float
"Float" stands for 'floating point number'. You can use it for real numbers with a fractional part, written with a decimal point, such as 1.11 or 3.14. Floats have limited precision, so many decimal values are stored only approximately.

Take a look at the following code cell and try out some of the integer and float operations!

In [None]:
# Floats
x = 4.0
y = 2.0

# Addition
print(x + y)

# Subtraction
print(x - y)

# Multiplication
print(x * y)

# Returns the quotient
print(x / y)

# Returns the remainder
print(x % y) 

# Absolute value
z = -24
print(abs(z))

# x to the power y
print(x ** y)
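
A few related operations are worth trying as well. This sketch uses its own sample values and shows floor division, `divmod`, and the fact that floats are stored approximately, which is why `math.isclose` is a safer way to compare them than `==`:

```python
import math

x = 7.0
y = 2.0

# Floor division drops the fractional part of the quotient.
print(x // y)                 # 3.0

# divmod returns the floor quotient and the remainder together.
print(divmod(x, y))           # (3.0, 1.0)

# Floats are stored approximately, so exact comparison can surprise you.
print(0.1 + 0.2)              # 0.30000000000000004
print(0.1 + 0.2 == 0.3)       # False
print(math.isclose(0.1 + 0.2, 0.3))   # True

# round() limits the number of decimal places.
print(round(3.14159, 2))      # 3.14
```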

For a discussion of non-primitive data structures such as lists, tuples and dictionaries, see the notebook on Non-Primitive-Data-Structures [here](http://localhost:8888/notebooks/python-language-notebooks/data-structures.ipynb)

### Escape Sequences

Suppose, you want to have a string which contains a single quote ('), how will you specify this string? For example, the string is **"What's your name?"**. You cannot specify **'What's your name?'** because Python will be confused as to where the string starts and ends. So, you will have to specify that this single quote does not indicate the end of the string. This can be done with the help of what is called an escape sequence. You specify the single quote as \' : notice the backslash. Now, you can specify the string as **'What\'s your name?'**.

Another way of specifying this specific string would be **"What's your name?"** i.e. using double quotes. Similarly, you have to use an escape sequence for using a double quote itself in a double quoted string. Also, you have to indicate the backslash itself using the escape sequence \\.

What if you wanted to specify a **two-line string?** One way is to use a triple-quoted string as shown previously or you can use an escape sequence for the newline character - **\n** to indicate the start of a new line. An example is:

In [None]:
print('This is the first line\nThis is the second line')

Another useful escape sequence to know is the tab: **\t**. There are many more escape sequences but I have mentioned only the most useful ones here.
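
The escape sequences discussed so far can be tried together in a short sketch (variable names are illustrative). Note that each escape sequence counts as a single character:

```python
# An escaped single quote inside a single-quoted string.
single_quoted = 'What\'s your name?'
double_quoted = "What's your name?"
print(single_quoted == double_quoted)   # True

# An escaped double quote inside a double-quoted string.
quote_inside = "He said \"Bond, James Bond.\""
print(quote_inside)

# A literal backslash is written as two backslashes.
path = 'C:\\temp'
print(path)          # C:\temp
print(len(path))     # 7

# A tab between two words, and a newline counted as one character.
columns = 'name\tage'
print(columns)
print(len('a\nb'))   # 3
```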

One thing to note is that in a string, a single backslash at the end of the line indicates that the string is continued in the next line, but no newline is added. For example:

In [None]:
print("This is the first sentence. \
This is the second sentence.")

### Variable

Using just literal constants can soon become boring: we need some way of storing information and manipulating it as well. This is where variables come into the picture. Variables are exactly what the name implies: their value can vary, i.e., you can store anything using a variable. Variables are just parts of your computer's memory where you store some information. Unlike literal constants, you need some method of accessing these variables and hence you give them names.
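
A minimal sketch of a variable whose value varies (the name `count` is illustrative). The same name can be given a new value, even one of a different type:

```python
count = 5
print(count)            # 5

# Use the current value to compute a new one.
count = count + 1
print(count)            # 6

# The same name can later refer to a value of another type.
count = 'six'
print(type(count))      # <class 'str'>
```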

### Identifier Naming

Variables are examples of identifiers. Identifiers are names given to identify something. There are some rules you have to follow for naming identifiers:

1. The first character of the identifier must be a letter of the alphabet (uppercase ASCII or lowercase ASCII or Unicode character) or an underscore (_).
2. The rest of the identifier name can consist of letters (uppercase ASCII or lowercase ASCII or Unicode character), underscores (_) or digits (0-9).
3. Identifier names are case-sensitive. For example, myname and myName are not the same. Note the lowercase n in the former and the uppercase N in the latter.
4. Examples of valid identifier names are i, name_2_3. Examples of invalid identifier names are 2things, this is spaced out, my-name and >a1b2_c3.
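
The naming rules above can be checked with the built-in `str.isidentifier()` method, applied here to the example names from the list. One caveat: `isidentifier()` also accepts reserved words such as `class`, which cannot be used as variable names, so the sketch checks those separately with the standard `keyword` module.

```python
import keyword

candidates = ['i', 'name_2_3', '2things', 'this is spaced out', 'my-name', '>a1b2_c3']

for name in candidates:
    print(repr(name), name.isidentifier())

# Keep only the names that follow the rules.
valid = [name for name in candidates if name.isidentifier()]
print(valid)    # ['i', 'name_2_3']

# Reserved words pass isidentifier() but are still not usable as names.
print('class'.isidentifier(), keyword.iskeyword('class'))   # True True
```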

### Data Types

Variables can hold values of different types called data types. The basic (primitive) types are numbers (integers and floats), strings and booleans, which we have already discussed above. In later chapters, we will see how to create our own types using classes.

### Object

Remember, Python refers to anything used in a program as an object. This is meant in the generic sense. Instead of saying "the something", we say "the object".

##### Note for Object Oriented Programming users:

Python is strongly object-oriented in the sense that everything is an object including numbers, strings and functions.
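
A minimal sketch of what "everything is an object" means in practice. The function `greet` is hypothetical and exists only for this illustration. Numbers, strings, booleans and functions are all instances of `object`, and they carry methods and attributes:

```python
def greet():
    """A hypothetical function used only for this illustration."""
    return 'hi'

# Every value, including a function, is an instance of object.
for thing in (42, 3.14, 'text', True, greet):
    print(type(thing).__name__, isinstance(thing, object))

# Even plain numbers and functions have methods and attributes.
print((255).bit_length())   # 8
print('abc'.upper())        # ABC
print(greet.__name__)       # greet
```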