# Cheat Sheet

### 1. To create a new empty Pandas dataframe with 3 columns:
   
```
data = pd.DataFrame(columns = ['column_name_1', 'column_name_2', 'column_name_3'])
```

Example:   

In [1]:
import pandas as pd
data = pd.DataFrame(columns = ['name', 'birth_date', 'height'])
print(data)

Empty DataFrame
Columns: [name, birth_date, height]
Index: []


---

### 2. To add a row to a Pandas dataframe:

```
data = pd.concat([data, pd.DataFrame([{'column_name_1': column_value_1, 'column_name_2': column_value_2}])], ignore_index=True)
```

`DataFrame.append` was removed in pandas 2.0, so wrap the new row in a one-row dataframe and join it to the existing one with `pd.concat`.

Example:

In [2]:
data = pd.concat([data, pd.DataFrame([{'name': 'John', 'birth_date': '1999-09-09', 'height': 63}])], ignore_index=True)
data = pd.concat([data, pd.DataFrame([{'name': 'Julia', 'birth_date': '1984-11-12', 'height': 71}])], ignore_index=True)
data = pd.concat([data, pd.DataFrame([{'name': 'Jack', 'birth_date': '1973-03-12', 'height': 69}])], ignore_index=True)
data = pd.concat([data, pd.DataFrame([{'name': 'Jack', 'birth_date': '2003-12-23', 'height': 73}])], ignore_index=True)
print(data)

    name  birth_date height
0   John  1999-09-09     63
1  Julia  1984-11-12     71
2   Jack  1973-03-12     69
3   Jack  2003-12-23     73
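
When all rows are known up front, it is usually simpler to build the dataframe in one step than to grow it row by row. A minimal sketch, assuming the rows are held in a list of dictionaries (the `rows` variable is illustrative and not part of the workshop code):

```python
import pandas as pd

# Each dictionary is one row; the keys become the column names.
rows = [
    {'name': 'John', 'birth_date': '1999-09-09', 'height': 63},
    {'name': 'Julia', 'birth_date': '1984-11-12', 'height': 71},
    {'name': 'Jack', 'birth_date': '1973-03-12', 'height': 69},
    {'name': 'Jack', 'birth_date': '2003-12-23', 'height': 73},
]
data = pd.DataFrame(rows)
print(data)
```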


---

### 3. To get unique values from a column in Pandas dataframe:

```
unique_values = data['column_name'].unique()
```

Example:

In [3]:
unique_names = data['name'].unique()
print(unique_names)

['John' 'Julia' 'Jack']
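
Two related calls may be handy alongside `unique()`: `nunique()` counts the distinct values and `value_counts()` counts how often each one occurs. A small sketch on a reduced copy of the example data:

```python
import pandas as pd

data = pd.DataFrame({'name': ['John', 'Julia', 'Jack', 'Jack']})

n_unique = data['name'].nunique()     # number of distinct names
counts = data['name'].value_counts()  # occurrences of each name
print(n_unique)
print(counts['Jack'])
```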


---

### 4. To sort a Pandas dataframe by one or more columns:

```
sorted_data = data.sort_values(by=['column_name_1', 'column_name_2'], ascending=False)
```

Example:

In [4]:
sorted_data = data.sort_values(by=['name', 'height'], ascending=True)
print(sorted_data)

    name  birth_date height
2   Jack  1973-03-12     69
3   Jack  2003-12-23     73
0   John  1999-09-09     63
1  Julia  1984-11-12     71


In [5]:
sorted_data = data.sort_values(by=['name', 'height'], ascending=False)
print(sorted_data)

    name  birth_date height
1  Julia  1984-11-12     71
0   John  1999-09-09     63
3   Jack  2003-12-23     73
2   Jack  1973-03-12     69
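
`ascending` also accepts a list with one entry per column, so each column can have its own direction. A sketch, assuming we want names A to Z and, within the same name, the tallest first:

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['John', 'Julia', 'Jack', 'Jack'],
    'height': [63, 71, 69, 73],
})

# name ascending, height descending within equal names
sorted_data = data.sort_values(by=['name', 'height'], ascending=[True, False])
print(sorted_data)
```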


---

### 5. To get the month from a date column in a Pandas dataframe and store it in a new column:

```
data['new_column_name'] = pd.to_datetime(data['date_column_name']).dt.month
```

Example:

In [6]:
data['birth_month'] = pd.to_datetime(data['birth_date']).dt.month
print(data)

    name  birth_date height  birth_month
0   John  1999-09-09     63            9
1  Julia  1984-11-12     71           11
2   Jack  1973-03-12     69            3
3   Jack  2003-12-23     73           12


---

### 6. To get the year from a date column in a Pandas dataframe and store it in a new column:

```
data['new_column_name'] = pd.to_datetime(data['date_column_name']).dt.year
```

Example:

In [7]:
data['birth_year'] = pd.to_datetime(data['birth_date']).dt.year
print(data)

    name  birth_date height  birth_month  birth_year
0   John  1999-09-09     63            9        1999
1  Julia  1984-11-12     71           11        1984
2   Jack  1973-03-12     69            3        1973
3   Jack  2003-12-23     73           12        2003
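
If several date parts are needed, the column can be parsed once and the result reused. A minimal sketch on a two-row sample (the `birth_dates` variable and the `birth_day` column are illustrative additions):

```python
import pandas as pd

data = pd.DataFrame({'birth_date': ['1999-09-09', '1984-11-12']})

# Parse the strings once, then read each part through the .dt accessor.
birth_dates = pd.to_datetime(data['birth_date'])
data['birth_day'] = birth_dates.dt.day
data['birth_month'] = birth_dates.dt.month
data['birth_year'] = birth_dates.dt.year
print(data)
```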


---

### 7. To create a column based on values of other columns

```
data['new_column_name'] = data['column_name_1'] + data['column_name_2']
```

Example:

In [8]:
data['age'] = 2019 - data['birth_year']
print(data)

    name  birth_date height  birth_month  birth_year  age
0   John  1999-09-09     63            9        1999   20
1  Julia  1984-11-12     71           11        1984   35
2   Jack  1973-03-12     69            3        1973   46
3   Jack  2003-12-23     73           12        2003   16


---

### 8. To filter a dataframe based on one or more conditions

```
subdata = data[filter1]
```

```
subdata = data[filter1 & filter2]
```

Example:

In [9]:
filter1 = data['name'] == 'Jack'
subdata = data[filter1]
print(subdata)

   name  birth_date height  birth_month  birth_year  age
2  Jack  1973-03-12     69            3        1973   46
3  Jack  2003-12-23     73           12        2003   16


In [10]:
filter1 = data['name'] == 'Jack'
filter2 = data['age'] < 20
subdata = data[filter1 & filter2]
print(subdata)

   name  birth_date height  birth_month  birth_year  age
3  Jack  2003-12-23     73           12        2003   16
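
Filters can also be combined with `|` (or) and negated with `~` (not). A sketch on a reduced copy of the example data; the filter names are illustrative. When conditions are written inline rather than stored in variables, each one needs its own parentheses, for example `data[(data['name'] == 'Jack') & (data['height'] > 70)]`.

```python
import pandas as pd

data = pd.DataFrame({
    'name': ['John', 'Julia', 'Jack', 'Jack'],
    'height': [63, 71, 69, 73],
})

is_jack = data['name'] == 'Jack'
is_tall = data['height'] > 70

jack_or_tall = data[is_jack | is_tall]  # rows matching either condition
not_jack = data[~is_jack]               # rows where the condition is False
print(jack_or_tall)
print(not_jack)
```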


---

### 9. To get the first `n` rows from a Pandas dataframe

```
subdata = data.head(n)
```

Example:

In [11]:
subdata = data.head(2)
print(subdata)

    name  birth_date height  birth_month  birth_year  age
0   John  1999-09-09     63            9        1999   20
1  Julia  1984-11-12     71           11        1984   35


---

### 10. To loop through a list

```
for x in my_list:
    print(x)
```

Example:

In [None]:
names = ['Jack', 'Julia', 'John']
for x in names:
    print(x)
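
When the position of each item is needed as well, the built-in `enumerate` can supply it. A short sketch (the variable names are illustrative):

```python
names = ['Jack', 'Julia', 'John']

# enumerate yields (position, item) pairs; start=1 begins counting at 1.
for position, name in enumerate(names, start=1):
    print(position, name)
```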

---

### 11. To loop through rows of a Pandas dataframe

```
for index, row in data.iterrows():
    print(index)
        
```

Example:

In [29]:
for index, row in data.iterrows():
    print('The name in row ', index, ' is ', row['name'])

The name in row  0  is  John
The name in row  1  is  Julia
The name in row  2  is  Jack
The name in row  3  is  Jack
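
As an alternative, `itertuples()` yields each row as a named tuple, with the index available as `row.Index` and each column as an attribute. A sketch on a two-row sample, assuming the column names are valid Python identifiers:

```python
import pandas as pd

data = pd.DataFrame({'name': ['John', 'Julia'], 'height': [63, 71]})

messages = []
for row in data.itertuples():
    # row.Index is the row label; row.name and row.height are the columns
    messages.append(f'Row {row.Index}: {row.name} is {row.height} tall')

for message in messages:
    print(message)
```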


---

### 12. To get the length of a list

```
n = len(my_list)
```

Example:

In [13]:
names = ['Jack', 'Julia', 'John']
n = len(names)
print(n)

3


---

### 13. To execute a block of code if a condition is met

```
if condition:
    print(x)
```

Example:

In [14]:
hour = 10
if hour < 18:
    print('Good day')

Good day
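
To handle more than one case, `elif` and `else` branches can follow the `if`. A small sketch extending the greeting example (the cut-off hours are arbitrary choices for illustration):

```python
hour = 20

# Conditions are checked top to bottom; the first one that is true wins.
if hour < 12:
    greeting = 'Good morning'
elif hour < 18:
    greeting = 'Good day'
else:
    greeting = 'Good evening'
print(greeting)
```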


---

### 14. To round a floating point number to `n` digits after the decimal point:

```
rounded_num = round(number, n)
```

Example:

In [15]:
rounded_num = round(23.45198, 2)
print(rounded_num)

23.45
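
Two details of `round` are worth knowing: exact halves go to the nearest even integer, and the result is a number, so trailing zeros are not kept. If a fixed number of decimals is needed for display, string formatting is one option. A short sketch:

```python
print(round(2.5))          # 2, ties go to the nearest even integer
print(round(3.5))          # 4
print(round(23.4, 2))      # 23.4, no trailing zero on a number

formatted = f'{23.4:.2f}'  # a string with exactly two decimals
print(formatted)           # 23.40
```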


---

### 15. To create a lowercased string from a given string:

```
lowercased_name = string.lower()
```

Example:

In [16]:
string = 'Python Data Analysis Workshop'
lowercased_name = string.lower()
print(lowercased_name)

python data analysis workshop


---

### 16. To create an uppercased string from a given string:

```
uppercased_name = string.upper()
```

Example:

In [17]:
string = 'Python Data Analysis Workshop'
uppercased_name = string.upper()
print(uppercased_name)

PYTHON DATA ANALYSIS WORKSHOP
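
A common use of case conversion is comparing strings without regard to case. A minimal sketch, with illustrative variable names:

```python
typed_name = 'JACK'
stored_name = 'Jack'

print(typed_name == stored_name)  # False, the comparison is case-sensitive

# Lowercasing both sides makes the comparison ignore case.
same_name = typed_name.lower() == stored_name.lower()
print(same_name)                  # True
```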


---

### 17. To plot a line graph using matplotlib package with x and y labels

```
plt.plot(list1, list2)
plt.xlabel("x label")
plt.ylabel("y label")
```

Example:

In [18]:
import matplotlib.pyplot as plt
list1 = [1, 2, 3, 4, 5]
list2 = [1, 4, 9, 16, 25]
plt.plot(list1, list2)
plt.xlabel("x-axis label")
plt.ylabel("y-axis label")

Text(0, 0.5, 'y-axis label')

---

### 18. To open an image file using PIL package

```
img = Image.open(image_file)
```

Example:

In [19]:
from PIL import Image
img = Image.open('test_image.png')

---

### 19. To read text from image using pytesseract package

```
text = pytesseract.image_to_string(img)
```

Example:

In [20]:
import pytesseract
text = pytesseract.image_to_string(img)
print(text)

Python For Data Analysis Workshop
Lighthouse Labs


---

### 20. To split the text read from an image into lines

```
lines = text.split('\n')
```

Example:

In [21]:
lines = text.split('\n')
print(lines)

['Python For Data Analysis Workshop', 'Lighthouse Labs']
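
Text read from an image may contain blank lines or a trailing newline, which `split('\n')` turns into empty strings. A sketch of one way to clean that up, using a made-up sample string rather than real OCR output:

```python
# Assumed sample text with a blank line and a trailing newline.
text = 'Python For Data Analysis Workshop\n\nLighthouse Labs\n'

raw_lines = text.split('\n')
# Keep only lines that still contain something after stripping whitespace.
clean_lines = [line.strip() for line in text.splitlines() if line.strip()]
print(raw_lines)
print(clean_lines)
```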


---

### 21. To get a substring from a string

```
substring = string[start_index:end_index]
```

Example:

In [22]:
string = 'My name is John'
name = string[11:15]
print(name)

John
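
Indexes start at 0 and the end index is not included. Either side of the colon can be omitted, negative indexes count from the end, and `find` can locate the start index instead of counting by hand. A short sketch:

```python
string = 'My name is John'

first_word = string[:2]      # from the start up to, but not including, index 2
last_word = string[-4:]      # the last four characters
start = string.find('John')  # index where 'John' begins, or -1 if absent
found = string[start:start + len('John')]
print(first_word, last_word, start, found)
```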


---

### 22. To change a string to a floating point number

```
num = float(string)
```

Example:

In [23]:
string = '34.783'
num = float(string)
print(num)

34.783
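
`float` raises a `ValueError` when the string is not a valid number, which can happen with text read from a file or an image. A minimal sketch of a fallback, where `to_float` is a hypothetical helper and not a built-in:

```python
def to_float(text, default=None):
    """Return text as a float, or default if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return default

print(to_float('34.783'))  # 34.783
print(to_float(' 12 '))    # 12.0, surrounding whitespace is ignored
print(to_float('n/a'))     # None
```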


---