# Python Tutorial

## Introduction

Python is a great general-purpose programming language on its own, but with the help of a few popular libraries (numpy, scipy, matplotlib) it becomes a powerful environment for scientific computing.

This section serves as a quick crash course on the Python programming language.

In this tutorial, we will cover:

* Basic Python: Basic data types (Lists, Dictionaries, Sets), Functions, Classes

## Basics of Python

Python is a high-level programming language whose syntax reads almost like pseudocode: it lets you express powerful ideas in very few lines of code while remaining quite readable. As an example, here is the addition of two numbers:

In [1]:
a, b = 2, 3
print(a + b)

5


### Basic data types

#### Numbers

Integers and floats work as you would expect from other languages:

In [4]:
x = 3
print(x, type(x))

3 <class 'int'>


In [7]:
print(x + 1)   # Addition; prints "4"
print(x - 1)   # Subtraction; prints "2"
print(x * 2)   # Multiplication; prints "6"
print(x ** 2)  # Exponentiation; prints "9"

4
2
6
9


In [8]:
x += 1
print(x)  # Prints "4"
x *= 2
print(x)  # Prints "8"

4
8


In [9]:
y = 2.5
print(type(y)) # Prints "<class 'float'>"
print(y, y + 1, y * 2, y ** 2) # Prints "2.5 3.5 5.0 6.25"

<class 'float'>
2.5 3.5 5.0 6.25


Note that unlike many languages, Python does not have unary increment (x++) or decrement (x--) operators.

Python integers have arbitrary precision (Python 3 has no separate long type), and complex numbers are also built in; you can find all of the details in the [documentation](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex).
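As a small illustration of these numeric types (a sketch only; the variable names are chosen for this example), Python 3 integers grow as needed and complex literals use the `j` suffix:

```python
# Integers have arbitrary precision in Python 3, so large results do not overflow
big = 2 ** 100
print(big, type(big))

# Complex numbers are written with a j suffix
z = 3 + 4j
print(z.real, z.imag)  # Real and imaginary parts are stored as floats
print(abs(z))          # Magnitude of the complex number
```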

We can also easily convert one type to another (type casting) in Python:

In [96]:
x, y = 5, 5.5
print(x, type(x), y, type(y))
x, y = float(x), int(y)
print(x, type(x), y, type(y))

5 <class 'int'> 5.5 <class 'float'>
5.0 <class 'float'> 5 <class 'int'>
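One detail worth knowing when casting is that `int()` truncates toward zero rather than rounding. A brief sketch with illustrative values:

```python
# int() drops the fractional part (truncates toward zero)
truncated_pos = int(5.9)
truncated_neg = int(-5.9)
print(truncated_pos, truncated_neg)

# round() rounds to the nearest integer instead
rounded = round(5.9)
print(rounded)

# Strings that hold numbers can be converted too
from_text = int("42") + 1
from_float_text = float("2.5") * 2
print(from_text, from_float_text)
```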


#### Booleans

Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols (`&&`, `||`, etc.):

In [11]:
t, f = True, False
print(type(t)) # Prints "<class 'bool'>"

<class 'bool'>


Now let's look at the operations:

In [13]:
print(t and f) # Logical AND; prints "False"
print(t or f)  # Logical OR; prints "True"
print(not t)   # Logical NOT; prints "False"

False
True
False
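Two further points may help here: `!=` behaves as exclusive OR on Booleans, and `and`/`or` short-circuit. A minimal sketch (the function `noisy` is a hypothetical helper used only to show when evaluation happens):

```python
t, f = True, False

xor = (t != f)   # Logical XOR; True when exactly one operand is True
print(xor)

# `and` / `or` short-circuit: the right operand is evaluated only if needed
def noisy():
    print("evaluated")
    return True

result = f and noisy()   # noisy() is never called because f is already False
print(result)
```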


#### Strings

In [14]:
hello = 'hello'   # String literals can use single quotes
world = "world"   # or double quotes; it does not matter.
print(hello, len(hello))

hello 5


In [15]:
hw = hello + ' ' + world  # String concatenation
print(hw)  # prints "hello world"

hello world


In [16]:
hw12 = '%s %s %d' % (hello, world, 12)  # sprintf style string formatting
print(hw12)  # prints "hello world 12"

hello world 12
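Besides `%` formatting, Python 3 also offers `str.format` and, from Python 3.6 onward, f-strings. A short sketch that builds the same string with each (the variable names are illustrative):

```python
hello, world = 'hello', 'world'

hw12_format = '{} {} {}'.format(hello, world, 12)   # str.format style
hw12_fstring = f'{hello} {world} {12}'              # f-string (Python 3.6+)
print(hw12_format)
print(hw12_fstring)

# Format specifiers control details such as precision
pi_text = f'{3.14159:.2f}'   # Two digits after the decimal point
print(pi_text)
```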


String objects have a bunch of useful methods; for example:

In [17]:
s = "hello"
print(s.capitalize())  # Capitalize a string; prints "Hello"
print(s.upper())       # Convert a string to uppercase; prints "HELLO"
print(s.rjust(7))      # Right-justify a string, padding with spaces; prints "  hello"
print(s.center(7))     # Center a string, padding with spaces; prints " hello "
print(s.replace('l', '(ell)'))  # Replace all instances of one substring with another;
                               # prints "he(ell)(ell)o"
print('  world '.strip())  # Strip leading and trailing whitespace; prints "world"

Hello
HELLO
  hello
 hello 
he(ell)(ell)o
world


You can find a list of all string methods in the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods).

### Containers

Python includes several built-in container types: lists, dictionaries, sets, and tuples.

#### Lists

A list is the Python equivalent of an array, but is resizeable and can contain elements of different types:

In [3]:
xs = [3, 1, 4]   # Create a list
print(xs, xs[2])
print(xs[-2])     # Negative indices count from the end of the list; prints "1"

[3, 1, 4] 4
1


In [4]:
xs[2] = 'foo'    # Lists can contain elements of different types
print(xs)

[3, 1, 'foo']


In [5]:
xs.append('bar') # Add a new element to the end of the list
print(xs)  

[3, 1, 'foo', 'bar']


In [6]:
x = xs.pop()     # Remove and return the last element of the list
print(x, xs) 

bar [3, 1, 'foo']


As usual, you can find all the gory details about lists in the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists).

#### Slicing

In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as slicing:

In [7]:
nums = [1, 2, 3, 4, 5]    # Create a list of five integers
print(nums)         # Prints "[1, 2, 3, 4, 5]"
print(nums[2:4])    # Get a slice from index 2 to 4 (exclusive); prints "[3, 4]"
print(nums[2:])     # Get a slice from index 2 to the end; prints "[3, 4, 5]"
print(nums[:2])     # Get a slice from the start to index 2 (exclusive); prints "[1, 2]"
print(nums[:])      # Get a slice of the whole list; prints "[1, 2, 3, 4, 5]"
print(nums[:-1])    # Slice indices can be negative; prints "[1, 2, 3, 4]"
nums[2:4] = [8, 9]  # Assign a new sublist to a slice
print(nums)         # Prints "[1, 2, 8, 9, 5]"

[1, 2, 3, 4, 5]
[3, 4]
[3, 4, 5]
[1, 2]
[1, 2, 3, 4, 5]
[1, 2, 3, 4]
[1, 2, 8, 9, 5]
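Slices also accept an optional third value, the step, and a slice always returns a new list. A minimal sketch with illustrative variable names:

```python
nums = [1, 2, 3, 4, 5]

# A third slice value sets the step
every_second = nums[::2]     # Every second element, starting at index 0
reversed_nums = nums[::-1]   # A negative step walks the list backwards

# Slicing returns a new list, so changing the copy leaves the original intact
nums_copy = nums[:]
nums_copy[0] = 99
print(every_second, reversed_nums, nums_copy, nums)
```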


#### Loops

You can loop over the elements of a list like this:

In [8]:
animals = ['cat', 'dog', 'monkey']
for i in animals:
    print(i)

cat
dog
monkey


If you want access to the index of each element within the body of a loop, use the built-in `enumerate` function:

In [26]:
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))

#1: cat
#2: dog
#3: monkey
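`enumerate` also takes an optional `start` argument, which avoids adding 1 to the index by hand. A short sketch (the list `labels` is introduced only so the result can be inspected):

```python
animals = ['cat', 'dog', 'monkey']
labels = []
for idx, animal in enumerate(animals, start=1):  # Counting begins at 1
    labels.append('#%d: %s' % (idx, animal))
print(labels)
```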


#### Dictionaries

A dictionary stores (key, value) pairs and is implemented as a hash table:

In [17]:
d = {'cat': 'cute', 'dog': 'furry'}  # Create a new dictionary with some data
print(d['cat'])       # Get an entry from a dictionary; prints "cute"
print('cat' in d)     # Check if a dictionary has a given key; prints "True"
d['cat'] = "Dog"
print(d['cat'])
d['cat'] = ['cute', 'dog']
print(d['cat'])

cute
True
Dog
['cute', 'dog']


In [18]:
d['fish'] = 'wet'    # Set an entry in a dictionary
print(d['fish'])      # Prints "wet"

wet


In [20]:
del d['fish']        # Remove an element from a dictionary
# print(d['fish'])  # Would now raise KeyError: 'fish' is no longer a key
len(d)

2

You can find all you need to know about dictionaries in the [documentation](https://docs.python.org/3/library/stdtypes.html#dict).
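Looking up a key that is not present raises a `KeyError`; the `get` method returns a default value instead. Because a dictionary is a hash table, its keys must also be hashable. A brief sketch with illustrative data:

```python
d = {'cat': 'cute', 'dog': 'furry'}

# Indexing a missing key raises KeyError
try:
    d['monkey']
except KeyError as err:
    print('Missing key:', err)

# get() returns a default instead of raising
monkey = d.get('monkey', 'N/A')
cat = d.get('cat', 'N/A')
print(monkey, cat)

# Keys must be hashable: a tuple works as a key, a list would not
points = {(0, 0): 'origin'}
print(points[(0, 0)])
```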

It is easy to iterate over the key-value pairs in a dictionary:

In [22]:
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal, legs in d.items():
    print('A %s has %d legs' % (animal, legs))

A person has 2 legs
A cat has 4 legs
A spider has 8 legs
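To loop over the keys alone or the values alone, you can iterate over the dictionary itself or use `keys()` and `values()`. A small sketch (dictionaries preserve insertion order from Python 3.7 onward):

```python
d = {'person': 2, 'cat': 4, 'spider': 8}

# Iterating over the dictionary itself yields its keys
for animal in d:
    print(animal, d[animal])

keys = list(d.keys())          # Keys only
legs = list(d.values())        # Values only
total_legs = sum(d.values())   # Values can be fed to any function that takes an iterable
print(keys, legs, total_legs)
```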


#### Sets

A set is an unordered collection of distinct elements. As a simple example, consider the following:

In [23]:
animals = {'cat', 'dog'}
print('cat' in animals)   # Check if an element is in a set; prints "True"
print('fish' in animals)  # prints "False"


True
False


In [24]:
animals.add('fish')      # Add an element to a set
print('fish' in animals)
print(len(animals))      # Number of elements in a set; prints "3"
print(animals)

True
3
{'cat', 'fish', 'dog'}


In [37]:
animals.add('cat')       # Adding an element that is already in the set does nothing
print(len(animals))       
animals.remove('cat')    # Remove an element from a set
print(len(animals))       

3
2
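Sets also support the usual mathematical set operations, and converting a list to a set drops repeated values. A minimal sketch with illustrative data:

```python
a = {'cat', 'dog', 'fish'}
b = {'dog', 'bird'}

union = a | b     # Elements in either set
common = a & b    # Elements in both sets
only_a = a - b    # Elements in a but not in b

# remove() raises KeyError for a missing element; discard() does not
a.discard('horse')   # No error even though 'horse' is absent

# Converting a list to a set keeps one copy of each distinct value
unique = set(['cat', 'dog', 'cat'])

# sorted() is used only to print in a predictable order, since sets are unordered
print(sorted(union), sorted(common), sorted(only_a), sorted(unique))
```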


### Functions

Python functions are defined using the `def` keyword. For example:

In [26]:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))

negative
zero
positive
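Note that `sign` compares its argument with the integer 0, so it expects a number; in Python 3, passing a string such as `'-1'` raises a `TypeError`. A minimal sketch, assuming the input arrives as text and using a hypothetical helper `sign_of_text`, converts each value with `int()` first:

```python
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

# Hypothetical helper: convert text to an integer before calling sign()
def sign_of_text(text):
    return sign(int(text))

results = []
for s in ['-1', '0', '1']:
    results.append(sign_of_text(s))
print(results)
```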

We will often define functions to take optional keyword arguments, like this:

In [44]:
def hello(name, loud=False):
    if loud:
        print('HELLO, %s' % name.upper())
    else:
        print('Hello, %s!' % name)

hello('Bob')
hello('Fred', loud=True)

Hello, Bob!
HELLO, FRED
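An optional argument can be omitted, passed by position, or passed by name. A short sketch with a hypothetical function `power` (not part of this tutorial's code):

```python
def power(base, exponent=2):
    # exponent defaults to 2 when the caller does not supply it
    return base ** exponent

squared = power(3)                # Uses the default exponent
cubed = power(3, 3)               # Optional argument passed by position
cubed_kw = power(3, exponent=3)   # Optional argument passed by keyword
print(squared, cubed, cubed_kw)
```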


### Classes

The syntax for defining classes in Python is straightforward:

In [28]:
class Greeter:
    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO, %s!' % self.name.upper())
        else:
            print('Hello, %s' % self.name)

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED!"

Hello, Fred
HELLO, FRED!


The first argument of every instance method, including `__init__`, is always a reference to the current instance of the class. By convention, this argument is named `self`. In `__init__`, `self` refers to the newly created object; in other instance methods, it refers to the instance whose method was called. For example, `g.greet()` is equivalent to `Greeter.greet(g)`.
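To make the role of `self` concrete, here is a sketch of a variant of `Greeter` whose `greet` method returns the greeting rather than printing it (a change made only so the two calls can be compared). Calling the method through the class and passing the instance by hand gives the same result as the usual call:

```python
class Greeter:
    def __init__(self, name):
        self.name = name  # Create an instance variable

    def greet(self, loud=False):
        if loud:
            return 'HELLO, %s!' % self.name.upper()
        return 'Hello, %s' % self.name

g = Greeter('Fred')
bound = g.greet()             # Python passes g as self automatically
explicit = Greeter.greet(g)   # The same call with self passed by hand
print(bound)
print(explicit)
```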

### File Handling

Basic file reading and writing is easy in Python:

In [4]:
# Open the file
fp = open("../data/deduplicate_music.csv", encoding='utf-8')

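# Example of a while loop
while True:
    # Read one line at a time
    line = fp.readline()
    # readline() returns an empty string at the end of the file
    if line:
        print(line.strip())
        break  # Stop after the first line (the header) to keep the output short
    else:
        break  # End of file reached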

# Close the file
fp.close()

artist_name,disc_title,genre_title,disc_released,disc_tracks,disc_seconds,disc_language


You can also read all the lines at once. Each line becomes an element in a list:

In [3]:
# Open the file
fp = open("../data/deduplicate_music.csv", encoding='utf-8')
lines = fp.readlines()
# Close the file
fp.close()

print(lines[:10])

['artist_name,disc_title,genre_title,disc_released,disc_tracks,disc_seconds,disc_language\n', 'Various,100+Dj (Cd3-Disco Funk),Disco,2009,13,4646,eng\n', 'Jule Neigel Band,Sphinx,Rock,0,4,1013,\n', 'The Bloodhound Gang,The Ballad Of Chasey Lain,Geffen Records,2000,6,1823,\n', "Various,Let'S Go Party,Latin,2002,17,4403,\n", 'Valence,Music For Transneptunian Space,Misc,0,5,2201,\n', 'Murk,Unholy Presences,Reggae,0,7,2428,\n', 'Ttatjana,Double Deal,Pop,0,2,447,\n', 'Joe Satriani,Engine Of Creation,Rock,2000,11,3214,\n', 'Northborne,The Pill Ep,Industrial,2006,5,1317,\n']
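A common alternative to calling `close()` by hand is the `with` statement, which closes the file automatically. The sketch below writes a small temporary file first so that it runs on its own; the file name `sample.csv` and its two rows are assumptions made for this example, not the tutorial's dataset:

```python
import os
import tempfile

# Hypothetical sample file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", encoding="utf-8") as fp:
    fp.write("artist_name,disc_title\n")
    fp.write("Murk,Unholy Presences\n")

# The file is closed automatically when the with block ends
rows = []
with open(path, encoding="utf-8") as fp:
    for line in fp:   # Iterating over a file yields one line at a time
        rows.append(line.strip().split(","))

print(rows)
print(fp.closed)   # The file object reports that it has been closed
```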


Now, let's try deduplicating the above dataset. Deduplication is the removal of duplicate samples from a dataset. This exercise will give us some food for thought on how to optimise code while also sealing the deal on Python as our programming language of choice!

Note: Order is not important for this dataset.
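Because order does not matter, one possible approach (a sketch only, and not necessarily the method this tutorial goes on to use) is to place the lines in a set, which keeps a single copy of each distinct line. The rows below are illustrative stand-ins for lines read from a CSV file:

```python
# Illustrative rows standing in for lines read from a CSV file
lines = [
    "Murk,Unholy Presences,Reggae\n",
    "Ttatjana,Double Deal,Pop\n",
    "Murk,Unholy Presences,Reggae\n",   # Exact duplicate of the first row
]

unique_lines = set(lines)   # A set keeps one copy of each distinct line
print(len(lines), len(unique_lines))
```

This only removes exact duplicates; rows that differ in any character are treated as distinct.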

In [2]:
fp = open("../data/test.csv", "w")
fp.write("XYZ123,56789,Atif\n")
fp.write("XYZ123,56789,Gourab\n")
fp.close()