<img src='images/logo_full.png'/>

## <p style="text-align: center;"> Python for Data Science </p>
The course is offered by [Ai Adventures](www.aiadventures.in). The notebooks are created by Pranav Uikey and Ankur Singh. This material is subject to the terms and conditions of the [Creative Commons CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license. Any use for commercial purposes is strictly prohibited.

# Python Basics

The Python programming language has a wide range of syntactical constructions, standard library functions, and interactive development environment features. Fortunately, you can ignore most of that; you just need to learn enough to write some handy little programs.
You will, however, have to learn some basic programming concepts before you can do anything. Like a wizard-in-training, you might think these concepts seem arcane and tedious, but with some knowledge and practice, you’ll be able to command your computer like a magic wand to perform incredible feats.

This chapter has a few examples that encourage you to type into the cells, which lets you execute Python instructions one at a time and shows you the results instantly. Using Jupyter notebooks is a great way to learn what basic Python instructions do, so give it a try as you follow along. You’ll remember the things you do much better than the things you only read.

In [1]:
2+2

4

In Python, 2 + 2 is called an expression, which is the most basic kind of programming instruction in the language. Expressions consist of values (such as 2) and operators (such as +), and they can always evaluate (that is, reduce) down to a single value. That means you can use expressions anywhere in Python code that you could also use a value.

In the previous example, 2 + 2 is evaluated down to a single value, 4. A single value with no operators is also considered an expression, though it evaluates only to itself, as shown here:

In [2]:
2

2

### Errors are Okay!

Programs will crash if they contain code the computer can’t understand, which will cause Python to show an error message. An error message won’t break your computer, though, so don’t be afraid to make mistakes. A crash just means the program stopped running unexpectedly.

If you want to know more about an error message, you can search for the exact message text online to find out more about that specific error.

There are plenty of other operators you can use in Python expressions, too. For example, the table below lists all the math operators in Python, from highest to lowest precedence.

<div class="book">
<table summary="Math Operators from Highest to Lowest Precedence" class="calibre9">
<colgroup class="calibre10">
<col class="calibre11">
<col class="calibre11">
<col class="calibre11">
<col class="calibre11">
</colgroup>
<thead class="calibre12">
<tr class="calibre13">
<th valign="top" class="calibre14">
<p class="calibre4">Operator</p>
</th>
<th valign="top" class="calibre14">
<p class="calibre4">Operation</p>
</th>
<th valign="top" class="calibre14">
<p class="calibre4">Example</p>
</th>
<th valign="top" class="calibre15">
<p class="calibre4">Evaluates to...</p>
</th>
</tr>
</thead>
<tbody class="calibre16">
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">**</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Exponent</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">2 ** 3</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">8</code></p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">%</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Modulus/remainder</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">22 % 8</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">6</code></p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">//</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Integer division/floored quotient</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">22 // 8</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">2</code></p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">/</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Division</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">22 / 8</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">2.75</code></p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">*</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Multiplication</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">3 * 5</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">15</code></p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">-</code></p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4">Subtraction</p>
</td>
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">5 - 2</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">3</code></p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre20">
<p class="calibre4"><code class="literal2">+</code></p>
</td>
<td valign="top" class="calibre20">
<p class="calibre4">Addition</p>
</td>
<td valign="top" class="calibre20">
<p class="calibre4"><code class="literal2">2 + 2</code></p>
</td>
<td valign="top" class="calibre21">
<p class="calibre4"><code class="literal2">4</code></p>
</td>
</tr>
</tbody>
</table>
</div>
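The `//` and `%` rows in the table are two halves of the same division: one gives the whole-number quotient and the other gives what is left over. The short sketch below uses the table's own numbers to show how the two fit together. It relies on print(), which is covered later in this chapter, only so that every line shows its result.

```python
# 22 // 8 is how many whole times 8 fits into 22; 22 % 8 is what is left over.
print(22 // 8)                  # 2
print(22 % 8)                   # 6

# Multiplying the quotient back and adding the remainder recovers 22.
print((22 // 8) * 8 + 22 % 8)   # 22
```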

The order of operations (also called precedence) of Python math operators is similar to that of mathematics. The ** operator is evaluated first; the *, /, //, and % operators are evaluated next, from left to right; and the + and - operators are evaluated last (also from left to right). You can use parentheses to override the usual precedence if you need to. Enter the following expressions into the cell below:

In [4]:
2 + 3 * 6

20

In [5]:
(2 + 3) * 6

30

In [6]:
48565878 * 578453

28093077826734

In [7]:
2 ** 8

256

In [8]:
23 / 7

3.2857142857142856

In [9]:
23 // 7

3

In [10]:
23 % 7

2

In [11]:
2     +            2

4

In [12]:
(5 - 1) * ((7 + 1) / (3 - 1))

16.0

In each case, you as the programmer must enter the expression, but Python does the hard part of evaluating it down to a single value. Python will keep evaluating parts of the expression until it becomes a single value, as shown in the figure below.

![](images/expression.png)
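The same idea can be checked in code. The sketch below writes out one plausible sequence of intermediate steps for the expression from cell 12. The exact steps drawn in the figure may differ, so treat the ordering here as an illustration rather than a description of Python's internals. Each line is a valid expression in its own right, and each one evaluates to the same value.

```python
# Each line replaces one parenthesized part with its value.
print((5 - 1) * ((7 + 1) / (3 - 1)))   # the full expression
print(4 * ((7 + 1) / (3 - 1)))         # 5 - 1 becomes 4
print(4 * (8 / (3 - 1)))               # 7 + 1 becomes 8
print(4 * (8 / 2))                     # 3 - 1 becomes 2
print(4 * 4.0)                         # 8 / 2 becomes 4.0
```

Every line prints 16.0, because replacing part of an expression with its value never changes the final result.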

These rules for putting operators and values together to form expressions are a fundamental part of Python as a programming language, just like the grammar rules that help us communicate. Here’s an example:

- This is a grammatically correct English sentence.

- This grammatically is sentence not English correct a.

The second line is difficult to parse because it doesn’t follow the rules of English. Similarly, if you type in a bad Python instruction, Python won’t be able to understand it and will display a SyntaxError error message, as shown here:

In [13]:
5 +

SyntaxError: invalid syntax (<ipython-input-13-4f4744a157be>, line 1)

In [14]:
42 + 5 + * 2

SyntaxError: invalid syntax (<ipython-input-14-16e07b76f178>, line 1)

You can always test to see whether an instruction works by typing it into the cell. Don’t worry about breaking the computer: The worst thing that could happen is that Python responds with an error message. Professional software developers get error messages while writing code all the time.

## The Integer, Floating-Point, and String Data Types

Remember that expressions are just values combined with operators, and they always evaluate down to a single value. A data type is a category for values, and every value belongs to exactly one data type. The most common data types in Python are listed in the table below. The values -2 and 30, for example, are said to be integer values. The integer (or int) data type indicates values that are whole numbers. Numbers with a decimal point, such as 3.14, are called floating-point numbers (or floats). Note that even though the value 42 is an integer, the value 42.0 would be a floating-point number.

<div class="table"><a id="calibre_link-103" class="calibre1"></a>
<div class="book">
<table summary="Common Data Types" class="calibre9">
<colgroup class="calibre10">
<col class="calibre11">
<col class="calibre11">
</colgroup>
<thead class="calibre12">
<tr class="calibre13">
<th valign="top" class="calibre14">
<p class="calibre4">Data type</p>
</th>
<th valign="top" class="calibre15">
<p class="calibre4">Examples</p>
</th>
</tr>
</thead>
<tbody class="calibre16">
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4">Integers</p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">-2</code>, <code class="literal2">-1</code>, <code class="literal2">0</code>, <code class="literal2">1</code>, <code class="literal2">2</code>, <code class="literal2">3</code>, <code class="literal2">4</code>, <code class="literal2">5</code></p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4">Floating-point numbers</p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">-1.25</code>, <code class="literal2">-1.0</code>, <code class="literal2">-0.5</code>, <code class="literal2">0.0</code>, <code class="literal2">0.5</code>, <code class="literal2">1.0</code>, <code class="literal2">1.25</code></p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre20">
<p class="calibre4">Strings</p>
</td>
<td valign="top" class="calibre21">
<p class="calibre4"><code class="literal2">'a'</code>, <code class="literal2">'aa'</code>, <code class="literal2">'aaa'</code>, <code class="literal2">'Hello!'</code>, <code class="literal2">'11 cats'</code></p>
</td>
</tr>
</tbody>
</table>
</div>
</div>

Python programs can also have text values called strings, or strs (pronounced “stirs”). Always surround your string in single quote (') characters (as in 'Hello' or 'Goodbye cruel world!') so Python knows where the string begins and ends. You can even have a string with no characters in it, '', called a blank string.
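If you are unsure which data type a value belongs to, you can ask Python. The sketch below uses the built-in type() function, which this chapter does not otherwise introduce, so treat it as an optional aside. It also uses print(), covered later, to show each result.

```python
# type() reports the data type of whatever value it is given.
print(type(42))      # <class 'int'>
print(type(42.0))    # <class 'float'>
print(type('42'))    # <class 'str'>
print(type(''))      # <class 'str'> (the blank string is still a string)
```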

If you ever see the error message SyntaxError: EOL while scanning string literal (worded as SyntaxError: unterminated string literal in Python 3.10 and later), you probably forgot the final single quote character at the end of the string, such as in this example:

In [15]:
'Hello world!

SyntaxError: EOL while scanning string literal (<ipython-input-15-b0711bd32433>, line 1)

## String Concatenation and Replication

The meaning of an operator may change based on the data types of the values next to it. For example, + is the addition operator when it operates on two integers or floating-point values. However, when + is used on two string values, it joins the strings as the string concatenation operator. Enter the following into the cell:

In [16]:
'Alice' + 'Bob'

'AliceBob'

The expression evaluates down to a single, new string value that combines the text of the two strings. However, if you try to use the + operator on a string and an integer value, Python will not know how to handle this, and it will display an error message.

In [17]:
'Alice' + 42

TypeError: must be str, not int

The error message must be str, not int means that Python thought you were trying to concatenate an integer to the string 'Alice'. (Newer Python versions word the same error as can only concatenate str (not "int") to str.) Your code will have to explicitly convert the integer to a string, because Python cannot do this automatically.
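One way to make the failing expression work is to convert the integer yourself before joining. The sketch below uses the str() function, which is covered in detail later in this chapter, and print() to show the result.

```python
# str(42) produces the string '42', which + can join to another string.
print('Alice' + str(42))   # Alice42
```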

The * operator is used for multiplication when it operates on two integer or floating-point values. But when the * operator is used on one string value and one integer value, it becomes the string replication operator. Enter a string multiplied by a number into the cell to see this in action.

In [18]:
'Alice' * 5

'AliceAliceAliceAliceAlice'

The expression evaluates down to a single string value that repeats the original a number of times equal to the integer value. String replication is a useful trick, but it’s not used as often as string concatenation.

The * operator can be used with only two numeric values (for multiplication) or one string value and one integer value (for string replication). Otherwise, Python will just display an error message.

In [19]:
'Alice' * 'Bob'

TypeError: can't multiply sequence by non-int of type 'str'

In [21]:
'Alice' * 5.0

TypeError: can't multiply sequence by non-int of type 'float'

It makes sense that Python wouldn’t understand these expressions: You can’t multiply two words, and it’s hard to replicate an arbitrary string a fractional number of times.
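Concatenation and replication can appear in the same expression, and the usual precedence rules still apply: * is evaluated before + unless parentheses say otherwise. The sketch below is a small illustration with made-up strings. It uses print(), covered later in this chapter, to show each result.

```python
# * binds more tightly than +, so the replication happens first.
print('Alice' + '!' * 3)      # Alice!!!

# Parentheses force the concatenation to happen first.
print(('Alice' + '!') * 3)    # Alice!Alice!Alice!

# Replication is handy for drawing a simple divider line.
print('-' * 20)
```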

# Storing Values in Variables
A variable is like a box in the computer’s memory where you can store a single value. If you want to use the result of an evaluated expression later in your program, you can save it inside a variable.

### Assignment Statements
You’ll store values in variables with an assignment statement. An assignment statement consists of a variable name, an equal sign (called the assignment operator), and the value to be stored. If you enter the assignment statement spam = 42, then a variable named spam will have the integer value 42 stored in it.

Think of a variable as a labeled box that a value is placed in, as in the figure.

![](images/000060.jpg)

In [22]:
spam = 40 # (1)

In [23]:
spam

40

In [24]:
eggs = 2

In [25]:
spam + eggs # (2)

42

In [26]:
spam + eggs + spam

82

In [27]:
spam = spam + 2 # (3)

In [28]:
spam

42

A variable is initialized (or created) the first time a value is stored in it (1). After that, you can use it in expressions with other variables and values (2). When a variable is assigned a new value (3), the old value is forgotten, which is why spam evaluated to 42 instead of 40 at the end of the example. This is called overwriting the variable. Enter the following code into the cells to try overwriting a string:

In [31]:
spam = 'Hello'

In [32]:
spam

'Hello'

In [33]:
spam = 'Goodbye'

In [34]:
spam

'Goodbye'

The spam variable in this example stores 'Hello' until you replace it with 'Goodbye'.

![](images/000064.jpg)
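A detail worth checking for yourself is that a variable stores a value, not the expression that produced it. In the sketch below, the name total is introduced only for illustration. Overwriting spam after total has been assigned does not change total.

```python
spam = 40
eggs = 2
total = spam + eggs   # total stores the value 42, not the expression

spam = 100            # overwriting spam later does not change total
print(total)          # 42
print(spam)           # 100
```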

# Variable Names
The table below has examples of legal variable names. You can name a variable anything as long as it obeys the following three rules:

- It can be only one word.

- It can use only letters, numbers, and the underscore (_) character.

- It can’t begin with a number.

<div class="book">
<table summary="Valid and Invalid Variable Names" class="calibre9">
<colgroup class="calibre10">
<col class="calibre11">
<col class="calibre11">
</colgroup>
<thead class="calibre12">
<tr class="calibre13">
<th valign="top" class="calibre14">
<p class="calibre4">Valid variable names</p>
</th>
<th valign="top" class="calibre15">
<p class="calibre4">Invalid variable names</p>
</th>
</tr>
</thead>
<tbody class="calibre16">
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">balance</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">current-balance</code> (hyphens are not allowed)</p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">currentBalance</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">current balance</code> (spaces are not allowed)</p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">current_balance</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">4account</code> (can’t begin with a number)</p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">_spam</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">42</code> (can’t begin with a number)</p>
</td>
</tr>
<tr class="calibre13">
<td valign="top" class="calibre17">
<p class="calibre4"><code class="literal2">SPAM</code></p>
</td>
<td valign="top" class="calibre18">
<p class="calibre4"><code class="literal2">total_$um</code> (special characters like <code class="literal2">$</code> are not allowed)</p>
</td>
</tr>
<tr class="calibre19">
<td valign="top" class="calibre20">
<p class="calibre4"><code class="literal2">account4</code></p>
</td>
<td valign="top" class="calibre21">
<p class="calibre4"><code class="literal2">'hello'</code> (special characters like <code class="literal2">'</code> are not allowed)</p>
</td>
</tr>
</tbody>
</table>
</div>

Variable names are case-sensitive, meaning that spam, SPAM, Spam, and sPaM are four different variables. It is a Python convention to start your variables with a lowercase letter.
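You can ask Python whether a piece of text is spelled like a legal name. The sketch below uses the string method isidentifier(), which is not introduced in this chapter, so treat it as an optional aside. It only checks spelling: it also returns True for reserved words such as for, which still cannot be used as variable names, and it accepts some non-English letters that the three rules above do not mention.

```python
# isidentifier() returns True when the text is spelled like a legal name.
print('current_balance'.isidentifier())   # True
print('current-balance'.isidentifier())   # False (hyphen)
print('current balance'.isidentifier())   # False (space)
print('4account'.isidentifier())          # False (starts with a number)
print('total_$um'.isidentifier())         # False ($ is not allowed)

# Names are case-sensitive: these are two separate variables.
spam = 'lowercase'
SPAM = 'uppercase'
print(spam, SPAM)                         # lowercase uppercase
```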

A good variable name describes the data it contains. Imagine that you moved to a new house and labeled all of your moving boxes as Stuff. You’d never find anything! The variable names spam, eggs, and bacon are used as generic names for the examples, but in your programs, a descriptive name will help make your code more readable.

## Your first program

In [None]:
# This program says hello and asks for my name.

print('Hello world!')
print('What is your name?')    # ask for their name
myName = input()
print('It is good to meet you, ' + myName)
print('The length of your name is:')
print(len(myName))
print('What is your age?')    # ask for their age
myAge = input()
print('You will be ' + str(int(myAge) + 1) + ' in a year.')

### Dissecting Your Program
Let’s take a quick look at what each line of code does.

#### Comments

The following line is called a comment.


    # This program says hello and asks for my name.
Python ignores comments, and you can use them to write notes or remind yourself what the code is trying to do. Any text for the rest of the line following a hash mark (#) is part of a comment.

Sometimes, programmers will put a # in front of a line of code to temporarily remove it while testing a program. This is called commenting out code, and it can be useful when you’re trying to figure out why a program doesn’t work. You can remove the # later when you are ready to put the line back in.

Python also ignores the blank line after the comment. You can add as many blank lines to your program as you want. This can make your code easier to read, like paragraphs in a book.

#### The print() Function
The print() function displays the string value inside the parentheses on the screen.


    print('Hello world!')
    print('What is your name?') # ask for their name

The line print('Hello world!') means “Print out the text in the string 'Hello world!'.” When Python executes this line, you say that Python is calling the print() function and the string value is being passed to the function. A value that is passed to a function call is an argument. Notice that the quotes are not printed to the screen. They just mark where the string begins and ends; they are not part of the string value.

#### Note:
You can also use this function to put a blank line on the screen; just call print() with nothing in between the parentheses.

#### The input() Function
The input() function waits for the user to type some text on the keyboard and press ENTER.


    myName = input()
This function call evaluates to a string equal to the user’s text, and the previous line of code assigns the myName variable to this string value.

You can think of the input() function call as an expression that evaluates to whatever string the user typed in. If the user entered 'Ai', then the expression would evaluate to myName = 'Ai'.

#### Printing the User’s Name
The following call to print() actually contains the expression 'It is good to meet you, ' + myName between the parentheses.

    print('It is good to meet you, ' + myName)

Remember that expressions can always evaluate to a single value. If 'Ai' is the value stored in myName on the previous line, then this expression evaluates to 'It is good to meet you, Ai'. This single string value is then passed to print(), which prints it on the screen.

#### The len() function

You can pass the len() function a string value (or a variable containing a string), and the function evaluates to the integer value of the number of characters in that string.

    print('The length of your name is:')
    print(len(myName))

In [3]:
len('hello')

5

In [4]:
len('My very energetic monster just scarfed nachos.')

46

In [5]:
len('')

0

Just like those examples, len(myName) evaluates to an integer. It is then passed to print() to be displayed on the screen. Notice that print() allows you to pass it either integer values or string values. But notice the error that shows up when you type the following into the cell:

In [6]:
print('I am ' + 29 + ' years old.')

TypeError: must be str, not int

The print() function isn’t causing that error, but rather it’s the expression you tried to pass to print(). You get the same error message if you type the expression into the cell on its own.

In [7]:
'I am ' + 29 + ' years old.'

TypeError: must be str, not int

Python gives an error because you can use the + operator only to add two integers together or concatenate two strings. You can’t add an integer to a string because this is ungrammatical in Python. You can fix this by using a string version of the integer instead, as explained in the next section.

#### The str(), int(), and float() Functions
If you want to concatenate an integer such as 29 with a string to pass to print(), you’ll need to get the value '29', which is the string form of 29. The str() function can be passed an integer value and will evaluate to a string value version of it, as follows:

In [8]:
str(29)

'29'

In [9]:
print('I am ' + str(29) + ' years old.')

I am 29 years old.


Because str(29) evaluates to '29', the expression 'I am ' + str(29) + ' years old.' evaluates to 'I am ' + '29' + ' years old.', which in turn evaluates to 'I am 29 years old.'. This is the value that is passed to the print() function.

The str(), int(), and float() functions will evaluate to the string, integer, and floating-point forms of the value you pass, respectively. Try converting some values with these functions, and watch what happens:

In [10]:
str(0), str(-3.14), int('42'), int('-99'), int(1.25), int(1.99), float('3.14'), float(10)

('0', '-3.14', 42, -99, 1, 1, 3.14, 10.0)

The previous examples call the str(), int(), and float() functions and pass them values of the other data types to obtain a string, integer, or floating-point form of those values.

The str() function is handy when you have an integer or float that you want to concatenate to a string. The int() function is also helpful if you have a number as a string value that you want to use in some mathematics. 
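The sketch below puts both conversions to work in one place. The variable names and values are made up for illustration. It also shows why the conversion matters: without int(), the * operator would replicate the string instead of multiplying numbers.

```python
price_text = '25'   # a number stored as a string
quantity = 3

# int() turns the text into a number so that * means multiplication.
total = int(price_text) * quantity

# str() turns the number back into text so that + means concatenation.
print('Total: ' + str(total))   # Total: 75

# Without int(), * replicates the string instead.
print(price_text * quantity)    # 252525
```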

#### Note:
The input() function always returns a string, even if the user enters a number. Enter spam = input() into the cell and enter 101 when it waits for your text.

In [11]:
spam = input()

 101


In [12]:
spam

'101'

The value stored inside spam isn’t the integer 101 but the string '101'. If you want to do math using the value in spam, use the int() function to get the integer form of spam and then store this as the new value in spam.

In [14]:
spam = int(spam)
spam

101

Now you should be able to treat the spam variable as an integer instead of a string.

In [15]:
spam * 10 / 5

202.0

#### Note: 
If you pass a value to int() that it cannot evaluate as an integer, Python will display an error message.

In [16]:
int('99.99')

ValueError: invalid literal for int() with base 10: '99.99'

In [17]:
int('twelve')

ValueError: invalid literal for int() with base 10: 'twelve'
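If the text holds a number with a decimal point, one possible workaround is to convert it to a float first and then pass that float to int(). This is a sketch with illustrative variable names, not the only way to handle such input. Note that int() drops the fractional part rather than rounding.

```python
text = '99.99'

as_float = float(text)    # float() accepts text with a decimal point
as_int = int(as_float)    # int() then drops the fractional part

print(as_float)           # 99.99
print(as_int)             # 99
```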

In your program, you used the int() and str() functions in the last three lines to get a value of the appropriate data type for the code.

    print('What is your age?') # ask for their age
    myAge = input()
    print('You will be ' + str(int(myAge) + 1) + ' in a year.')
The myAge variable contains the value returned from input(). Because the input() function always returns a string (even if the user typed in a number), you can use the int(myAge) code to return an integer value of the string in myAge. This integer value is then added to 1 in the expression int(myAge) + 1.

The result of this addition is passed to the str() function: str(int(myAge) + 1). The string value returned is then concatenated with the strings 'You will be ' and ' in a year.' to evaluate to one large string value. This large string is finally passed to print() to be displayed on the screen.

Let’s say the user enters the string '4' for myAge. The string '4' is converted to an integer, so you can add one to it. The result is 5. The str() function converts the result back to a string, so you can concatenate it with the surrounding strings, 'You will be ' and ' in a year.', to create the final message. These evaluation steps would look something like this:

![](images/000069.png)
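You can replay these steps in a cell. In the sketch below, myAge is set directly to '4' instead of calling input(), so the example runs without waiting for the keyboard. Each later line replaces one more part of the expression with its value. The step order is one plausible reading of the figure, not necessarily the exact one it draws.

```python
myAge = '4'   # stands in for what input() would return

print('You will be ' + str(int(myAge) + 1) + ' in a year.')
print('You will be ' + str(int('4') + 1) + ' in a year.')   # myAge becomes '4'
print('You will be ' + str(4 + 1) + ' in a year.')          # int('4') becomes 4
print('You will be ' + str(5) + ' in a year.')              # 4 + 1 becomes 5
print('You will be ' + '5' + ' in a year.')                 # str(5) becomes '5'
print('You will be 5 in a year.')                           # the strings are joined
```

All six lines print the same message, You will be 5 in a year.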

#### Text and Number Equivalence

Although the string value of a number is considered a completely different value from the integer or floating-point version, an integer can be equal to a floating-point value.

In [18]:
42 == '42'

False

In [19]:
42 == 42.0

True

In [20]:
42.0 == 0042.000

True

Python makes this distinction because strings are text, while integers and floats are both numbers.
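In practice this matters most for text that comes from input(), which is always a string. The sketch below, with an illustrative variable name, shows that converting the text first makes the comparison behave as you would expect.

```python
age_text = '42'   # stands in for text typed by a user

print(age_text == 42)          # False: a string never equals a number
print(int(age_text) == 42)     # True
print(float(age_text) == 42)   # True: 42.0 equals 42
```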



## Summary
You can compute expressions with a calculator or type string concatenations with a word processor. You can even do string replication easily by copying and pasting text. But expressions, and their component values—operators, variables, and function calls—are the basic building blocks that make programs. Once you know how to handle these elements, you will be able to instruct Python to operate on large amounts of data for you.

It is good to remember the different types of operators (+, -, *, /, //, %, and ** for math operations, and + and * for string operations) and the three data types (integers, floating-point numbers, and strings) introduced in this chapter.

A few different functions were introduced as well. The print() and input() functions handle simple text output (to the screen) and input (from the keyboard). The len() function takes a string and evaluates to an int of the number of characters in the string. The str(), int(), and float() functions will evaluate to the string, integer, or floating-point number form of the value they are passed.

In the next chapter, you will learn how to tell Python which code to run, which code to skip, and which code to repeat based on the values it has. This is known as **flow control**, and it allows you to write programs that make intelligent decisions.

### Further Reading: 
Search online for the Python documentation for the **len()** function. It will be on a web page titled “Built-in Functions.” Skim the list of other functions Python has, look up what the **round()** function does, and experiment with it in a notebook cell.
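As a starting point for that experiment, here are a few calls to try. One behavior that surprises many people is that Python 3 rounds a value exactly halfway between two integers to the nearest even integer.

```python
# With a second argument, round() keeps that many decimal places.
print(round(3.14159, 2))   # 3.14

# With one argument, round() returns the nearest integer.
print(round(2.7))          # 3

# Exact halves go to the nearest even integer.
print(round(2.5))          # 2
print(round(3.5))          # 4
```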