## PythonCheatSheet

Data Science Roadmap

1. Basic Tools: Python, R, or SQL. You do not need to know all of them; learning how to use **python** is enough.
1. Basic Statistics: Concepts such as mean, median, and standard deviation. Once you know basic statistics, they are easy to compute with **python**.
1. Data Munging: Working with messy and difficult data, such as inconsistent date and string formatting. As you might guess, **python** helps here too.
1. Data Visualization: The title is self-explanatory. We will visualize data with **python** libraries such as Matplotlib and seaborn.
1. Machine Learning: You do not need to understand all the math behind a machine learning technique. You only need to understand the basics of machine learning and learn how to implement it with **python**.

### In summary, we will learn Python to become data scientists!

**Content:**
1. [Introduction to Python:](#0)
    1. [Numbers and operations](#1)
    1. [String manipulation](#2)
    1. [Lists Collections](#3)
    1. [Tuples Collections](#4)
    1. [Dictionaries ](#5)
    1. [Logic, control flow and filtering](#6)
    1. [Loop data structures](#7)
1. Python Data Science Toolbox:
    1. [Json](#8)
    1. [File Handling](#9)
    1. [User defined function](#10)
    1. [Default and flexible arguments](#11)
    1. [Lambda function](#12)
    1. [DateTime module](#13)
    1. [NumPy](#14)
    1. [Numpy Operations](#15)
    1. [Aggregate Function](#16)
    1. [Pandas](#17)
    1. [Missing Data Handling](#18)
    1. [Pandas File handling](#19)
    1. [Matplotlib](#20)
    1. [Error handling](#21)


To run a cell, press Shift+Enter or click Run at the top of the page.

Python uses indentation to mark which block a statement belongs to. In an `if`/`else` statement,
`if` and `else` sit at the same level, while the `print` calls inside them are indented one level deeper.
Statements at the same level must use the same amount of indentation.
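
As a minimal sketch of the indentation rule (the variable `temperature` and its value are made up for illustration), `if` and `else` share one level and the statements inside them sit one level deeper:

```python
# Hypothetical value, chosen only to illustrate indentation.
temperature = 25

if temperature > 20:
    # Indented four spaces: these lines belong to the `if` block.
    message = "warm"
    print(message)
else:
    # Same indentation as above: these lines belong to the `else` block.
    message = "cold"
    print(message)
```

Mixing different indentation widths inside one block typically raises an `IndentationError`.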


<a id="0"></a> <br>
## **Introduction to Python**

<a id="1"></a> <br>
### Numbers and operations
Matplotlib is a Python library that helps us plot data. The easiest and most basic plots are line, scatter, and histogram plots.
* A line plot works best when the x axis is time.
* A scatter plot works best when there is a correlation between two variables.
* A histogram works best when we need to see the distribution of numerical data.
* Customization: colors, labels, line thickness, title, opacity, grid, figsize, axis ticks, and linestyle

In [1]:
# Number examples
a = 4 + 18
print("Sum of two ints: {} and the type is {}".format(a, type(a)))
b = 5 + 2.3
print("Sum of int and float: {} and the type is {}".format(b, type(b)))
c = 5 + 2.3j
print("Sum of int and complex: {} and the type is {}".format(c, type(c)))
d = 45.5 - 2.3
print("Subtraction of two floats: {} and the type is {}".format(d, type(d)))
e = 48 * 2.3
print("Multiplication of int and float: {} and the type is {}".format(e, type(e)))
f = 48 / 2.3
print("Division of int by float: {} and the type is {}".format(f, type(f)))
g = 48 // 2.3
print("Floor division of int by float: {} and the type is {}".format(g, type(g)))
h = 48 % 2.3
print("Modulus of int and float: {} and the type is {}".format(h, type(h)))
i = 48 ** 2.3
print("Exponentiation of int by float: {} and the type is {}".format(i, type(i)))
j = abs(-48)
print("Absolute value of an int: {} and the type is {}".format(j, type(j)))
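
The `//` and `%` operators used above are two halves of the same operation: the quotient and the remainder. The following is a small sketch with arbitrary integer values (not taken from the cell above) that shows how they fit together:

```python
# Arbitrary example values, used only for illustration.
dividend = 17
divisor = 5

quotient = dividend // divisor   # floor division
remainder = dividend % divisor   # modulus

# divmod() returns both results in one call.
print(divmod(dividend, divisor))
# For integers, the two results always reconstruct the original number.
print(quotient * divisor + remainder == dividend)
```

Integers are used here because floating-point rounding can make the same check inexact for floats.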


<a id="2"></a> <br>
### String manipulation

In [2]:
# Store strings in a variable
words = "hello world to everyone"
# Print the words value
print(words)
# Use [] to access the character of the string. The first character is indicated by '0'.
print(words[0])
# Use the len() function to find the length of the string
print(len(words))
# Some examples of finding in strings
print(words.count('l')) # Count number of times l repeats in the string
print(words.find("o")) # Find letter 'o' in the string. Returns the position of first match.
print(words.count(' ')) # Count number of spaces in the string
print(words.upper()) # Change the string to uppercase
print(words.lower()) # Change the string to lowercase
print(words.replace("everyone","you")) # Replace word "everyone" with "you"
print(words.title()) # Change string to title format
print(words + "!!!") # Concatenate strings
print(" ".join(words)) # Insert a space between each character
print("".join(reversed(words))) # Reverse the string

<a id="3"></a> <br>
### Lists Collections

In [3]:
# A Python list is similar to an array. You can create an empty list too.
empty_list = []  # avoid naming a variable `list`: that would shadow the built-in type
list1 = [1,2,3,4,5]
list2 = ["a","b","c","d","e",1,2,3,4,5,True,False,23.5,1+2j]

In [4]:
# Nest multiple lists
nested_list = [list1,list2]
nested_list

In [5]:
# Combine multiple lists
combined_list = list1 + list2
combined_list

In [6]:
# you can slice a list
combined_list[1:10] # Return the items at index 1 through 9 (the end index is excluded)

In [7]:
# append a new item to the list
combined_list.append(10)
combined_list

In [8]:
# remove the last item from the list; pop() returns the removed item
combined_list.pop()

In [9]:
#iterate through a list
for item in combined_list:
    print(item)

In [10]:
# Check if an item is in the list
if "a" in list2:
    print("a is in the list")

In [11]:
# Check if an item is not in the list
if "a" not in list2:
    print("a is not in the list")

<a id="4"></a> <br>
### Tuples Collections

In [12]:
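# A tuple is similar to a list, but it is immutable.
my_tuple = (1,2,3,4,5)  # avoid naming a variable `tuple`: that would shadow the built-in type
my_tuple

In [13]:
# slice a tuple; an end index past the last item is clipped to the tuple's length
my_tuple[1:10]

In [14]:
# iterate through a tuple
for item in my_tuple:
    print(item)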
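
To make "immutable" concrete, here is a short sketch (the tuple `point` is a made-up example) showing that a tuple can be unpacked but that assigning to one of its items raises a `TypeError`:

```python
point = (3, 4)  # hypothetical example tuple

# Unpacking assigns each element to its own name.
x, y = point

# Item assignment is not allowed on a tuple and raises TypeError.
try:
    point[0] = 10
    changed = True
except TypeError:
    changed = False

print(x, y, changed)
```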
    

<a id="5"></a> <br>
### Dictionaries

In [15]:
# A dictionary is a collection of key-value pairs.
# You can create an empty dictionary too.
empty_dict = {}  # avoid naming a variable `dict`: that would shadow the built-in type
dict1 = {"a":1,"b":2,"c":3,"d":4,"e":5}
dict1

In [16]:
# iterate through a dictionary
for key, value in dict1.items():
    print(key, value)
    

In [17]:
# Check if a key is in the dictionary
if "a" in dict1:
    print("a key is in the dictionary")


In [18]:
# get the values of dictionary
dict1.values()

In [19]:
# get the keys of dictionary
dict1.keys()

In [20]:
# get the value of a key
dict1["a"]

In [21]:
# get the items of dictionary
dict1.items()

In [22]:
# get the length of dictionary
len(dict1)

In [23]:
# add a new key-value pair to the dictionary
dict1["f"] = 6
dict1

In [24]:
# remove a key-value pair from the dictionary
del dict1["f"]
dict1

In [25]:
# indexing a dictionary
dict1["a"]


In [26]:
# Create a list of dictionaries
list_of_dicts = [{"a":1},{"b":2},{"c":3},{"d":4},{"e":5}]
list_of_dicts


In [27]:
# Create a dictionary of lists
dict_of_lists = {"a":[1,2,3],"b":[4,5,6],"c":[7,8,9]}
dict_of_lists
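
Indexing with a key that does not exist raises a `KeyError`. One common way around this, sketched here with a made-up `prices` dictionary, is `dict.get()`, which returns a default value instead:

```python
prices = {"apple": 1.5, "banana": 0.5}  # hypothetical data

# get() returns the stored value when the key exists...
apple_price = prices.get("apple", 0.0)
# ...and the given default when it does not.
cherry_price = prices.get("cherry", 0.0)
print(apple_price, cherry_price)

# Plain indexing raises KeyError for a missing key.
try:
    prices["cherry"]
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
print(missing_key_raised)
```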

In [28]:
print(type(dict_of_lists))  # <class 'dict'>

<a id="6"></a> <br>
### Logic, control flow and filtering

In [None]:
# if, elif, else conditions
# If the `if` condition is true, its block runs; otherwise each `elif` condition is checked in turn; `else` runs when none of them is true.
# Python relies on indentation to define blocks and the scope of code.

In [None]:
x=10
y=20
z=30

if x<y:
    print("x is less than y")
elif x==y:
    print("x is equal to y")
    
elif z>y:
    print("z is greater than y")
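
The chain above has no `else` branch, so nothing runs when every condition is false. The sketch below (the function `compare` and the list `values` are illustrative names, not part of the original notebook) adds the `else` case and also shows simple filtering of a list with a condition:

```python
def compare(x, y):
    """Return a short description of how x relates to y."""
    if x < y:
        return "less"
    elif x == y:
        return "equal"
    else:
        # Runs only when none of the conditions above were true.
        return "greater"

print(compare(10, 20), compare(20, 20), compare(30, 20))

# Filtering: keep only the values that satisfy a condition.
values = [5, 12, 7, 30, 18]  # hypothetical data
large_values = [v for v in values if v > 10]
print(large_values)
```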


<a id="7"></a> <br>
### Loops

In [None]:
# for loop, while loop, break, continue, pass.
# A for loop iterates over the items of a sequence or other iterable, such as a string, a list, a tuple, a dictionary, or a set.
# A while loop repeats its block for as long as its condition stays true.
# break exits the innermost for or while loop.
# continue skips the rest of the current iteration and jumps to the next one.
# pass is a null operation, which does nothing.
# An else clause on a loop runs when the loop finishes without hitting break.

In [None]:
i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1

In [None]:
i = 1
while i < 6:
    i += 1
    print("count is:", i)
    if i == 3:
        continue  # skip print(i) below when i is 3
    print(i)
else:
    print("i is no longer less than 6")

In [None]:
x = 0
while x < 5:
    x += 1
    if x == 2:
        continue
    print(x)
else:
    print("x is no longer less than 5")

In [None]:
# Sample for loop examples
fruits = ["orange", "banana", "apple", "grape", "cherry"]
for fruit in fruits:
    print(fruit)
    print("-"*10)
    

# Iterating range
for x in range(1, 10, 2):
    print(x)
else:
    print("task complete")
    print("="*10)
    print("\n")
# Nested loops over two lists: every light is paired with every action (9 lines of output)
traffic_lights = ["red", "yellow", "green"]
action = ["stop", "slow down", "go"]
for light in traffic_lights:
    for task in action:
        print(light, task)
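
The nested loop above prints every combination of light and action. If the goal is instead to walk through both lists side by side, one option is `zip()`; the sketch below reuses the same values (the list is called `actions` here purely for readability):

```python
traffic_lights = ["red", "yellow", "green"]
actions = ["stop", "slow down", "go"]

# zip() pairs items by position: one action per light.
pairs = list(zip(traffic_lights, actions))
for light, task in pairs:
    print(light, "->", task)

# enumerate() yields the index together with each item.
for index, light in enumerate(traffic_lights):
    print(index, light)
```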

<a id="8"></a> <br>
###  JSON


In [None]:
# JSON is a data format used to store and transmit data.
# JSON is text written in JavaScript Object Notation.
# It is a lightweight data-interchange format.
# It is easy for humans to read and write.
# It is easy for machines to parse and generate.
# It is based on a subset of the JavaScript syntax.
# Strings and object names (keys) must use double quotes, not single quotes.
# It supports numbers, strings, objects, arrays, booleans (true/false), and null.
# It does not have a syntax for comments.
# Python can convert a dictionary to JSON.
# Python can convert JSON to a dictionary.
# Python has a built-in package called json that can be used to work with JSON data.


In [None]:
import json

# sample json data
json_data = '{"name":"John", "age":30, "city":"New York"}'

json_data
print(type(json_data))

In [None]:
#read json data
data = json.loads(json_data)
data

In [None]:
#print the output , which is similar to dictionary
print("Name: " + data["name"]+"\nAge: " + str(data["age"])+"\nCity: " + data["city"])
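
The cells above only go from JSON text to a dictionary. The opposite direction uses `json.dumps()`; this sketch round-trips the same sample record (the variable names `person`, `json_text`, and `restored` are illustrative):

```python
import json

person = {"name": "John", "age": 30, "city": "New York"}

# dict -> JSON text
json_text = json.dumps(person)
print(json_text, type(json_text))

# JSON text -> dict again; the round trip gives back an equal dictionary
restored = json.loads(json_text)
print(restored == person)

# indent= produces a more readable, multi-line layout
print(json.dumps(person, indent=2))
```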

<a id="9"></a> <br>
### 9. File Handling

* Python has a built-in function called open() that can be used to open a file.
* The open() function returns a file object. For a text file, iterating over it yields one line at a time.
* The open() function is most often called with two arguments:
    * The first argument is the name of the file to open.
    * The second argument is the mode in which the file is to be opened. It defaults to "r".
* The mode selects read-only, write-only, or read and write access, and also whether the file is opened in text or binary mode.
* The most common access modes are:
    * r - Read - Default value. Opens a file for reading; raises an error if the file does not exist.
    * w - Write - Opens a file for writing; creates the file if it does not exist and truncates it if it does.
    * a - Append - Opens a file for appending; creates the file if it does not exist.
    * r+ - Read / Write - Opens an existing file for reading and writing; raises an error if the file does not exist.
* A letter can be added to the access mode to choose between text and binary mode:
    * "t" - Text mode - Default. Data is read and written as strings (str), decoded and encoded with a text encoding.
    * "b" - Binary mode - Data is read and written as raw bytes objects, with no decoding.

In [None]:
# Let's create a sample text file with five lines
!echo "This is a test file with text in it. This is the first line." > test.txt
!echo "This is the second line." >> test.txt
!echo "This is the third line." >> test.txt
!echo "This is the fourth line." >> test.txt
!echo "This is the fifth line." >> test.txt

In [None]:
# read the file
f = open("test.txt", "r") # open the file in read mode, if the file does not exist then it will error
content=f.read()   # read the all of the content in the file
print(content);print("="*20);print("type of content is:",type(content))
f.close()  # closes the file

In [None]:
#read first 15 characters of the file
f = open("test.txt", "r") # open the file in read mode, if the file does not exist then it will error
content=f.read(15)   # read first 15 characters
print(content);print("="*20);print("type of content is:",type(content))
f.close()  # closes the file

In [None]:
#Read line from the file
f = open("test.txt", "r")
content=f.readline()   # readline reads the first line of the file
print(content);print("="*20);print("type of content is:",type(content))
f.close()  # closes the file

In [None]:
# Read lines from the file
f = open("test.txt", "r")
content=f.readlines()   # readlines reads the entire file and returns a list of lines
print(content);print("="*20);print("type of content is:",type(content))
f.close()  # closes the file


In [None]:
# Read line by line
f = open("test.txt", "r") # opens the file, if the file does not exist then it will error
for line in f:
    print(line) # reads the line by line from file
f.close()  # closes the file


In [None]:
#create a new file
f = open("test2.txt", "w") # creates the file if it does not exist, otherwise it overwrites the file
f.write("This is the first line.")  # writes the line to the file
f.close()  # closes the file




In [None]:
# Read and write to a file
f = open("test2.txt", "r+") # opens an existing file for both reading and writing; raises an error if the file does not exist, and does not truncate it
content=f.read()  # reads the file; the file position is now at the end
print("before \n",content);print("="*20);print("type of content is:",type(content)) # prints the content
f.write("This is the second line.")  # writes at the current position (the end of the file)
f.writelines(["\n This is the third line.", " \n This is the fourth line."])  # writes the lines to the file, the \n is used to create a new line in the file
f.seek(0)  # move back to the start; without this, read() would return an empty string
content=f.read()  # reads the whole file again
print("after \n ",content);print("="*20);print("type of content is:",type(content)) # prints the content

f.close()  # closes the file


In [None]:
#Append the file
f = open("test2.txt", "a") # opens the file for appending, creates the file if it does not exist.it does not overwrite the file
f.write("\n This is the fifth line.")  # writes the line to the file
f.close()  # closes the file


# read to a file
f = open("test2.txt", "r") # opens the file for reading, if the file does not exist then it will error
content=f.read()  # reads the file
print("after append \n",content);print("="*20);print("type of content is:",type(content)) # prints the content
f.close()  # closes the file
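
The cells above close each file by hand with `f.close()`. A common alternative is a `with` block, which closes the file automatically. This is a minimal sketch; it writes to a throwaway temporary path (an assumption made here so the example does not touch `test.txt` or `test2.txt`):

```python
import os
import tempfile

# A throwaway path, so the sketch does not touch test.txt or test2.txt.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# The file is closed automatically when the with block ends,
# even if an exception is raised inside it.
with open(path, "w") as f:
    f.write("first line\n")
    f.write("second line\n")

with open(path, "r") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
print(f.closed)  # the file is already closed here, without an explicit f.close()
```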

<a id="10"></a> <br>
### 10. Functions

* A function is a block of code which only runs when it is called.
* In Python, a function body is not wrapped in curly brackets; it is marked by indentation with spaces or tabs.
* A function is defined using the **def** keyword.
* You can pass data, known as arguments, into a function; the names that receive them are its parameters.
* A function can take no parameters, any number of positional or keyword parameters, and parameters with default values.
* A function can use a return statement to return a value; without one, it returns None.
* A function can return multiple values, which are packed into a tuple.

In [None]:
#defining a function
def add(a,b): 
    return a+b 

#calling a function
result = add(25,56) 
print("sum of two variable add(25,56) = ",result)
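
Two of the points listed for functions, returning multiple values and returning `None` when there is no `return` statement, can be sketched as follows (`min_max` and `log_message` are illustrative names, not part of the original notebook):

```python
def min_max(numbers):
    """Return the smallest and largest value as a tuple."""
    return min(numbers), max(numbers)

def log_message(text):
    """Print text; there is no return statement, so the result is None."""
    print(text)

lowest, highest = min_max([4, 9, 1, 7])  # tuple unpacking
print(lowest, highest)

result = log_message("hello")
print(result)
```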

<a id="11"></a> <br>
### 11. Default and flexible arguments

In [None]:
# Sample function with parameters
def greet(name, age):
    print("Hello", name, "you are", age, "years old")

# Calling the function
greet("John", 25)

In [None]:
# sample function with a default parameter value
def greet(name, age=18):
    print("Hello", name, "you are", age, "years old")

# calling the function without age uses the default value 18
greet("John")

In [None]:
# sample function with a variable number of positional arguments (*args)
def third_child(*kids):
    print("The youngest child is", kids[2]) # kids is a tuple; this prints its third element

#calling the function
third_child("John", "Jim", "Jack", "Mia", "Mary")


In [None]:
# Sample function with a variable number of keyword arguments (**kwargs)
def print_last_name(**kid):
    print("His last name is", kid["lname"]) # kid is a dictionary; this prints the value of the key "lname"

#calling the function
print_last_name(fname="John", lname="Doe", age=18)
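
Default values, `*args`, and `**kwargs` can also be combined in one signature. The function `describe` below is a hypothetical example written for this sketch; it only illustrates the order in which the parameter kinds appear:

```python
def describe(name, *hobbies, greeting="Hello", **details):
    """Build a one-line description from mixed argument kinds."""
    parts = [f"{greeting} {name}"]
    if hobbies:  # extra positional arguments arrive as a tuple
        parts.append("hobbies: " + ", ".join(hobbies))
    for key, value in details.items():  # extra keyword arguments arrive as a dict
        parts.append(f"{key}={value}")
    return "; ".join(parts)

print(describe("John"))
print(describe("John", "chess", "golf", age=18))
print(describe("Mia", greeting="Hi"))
```

Because `greeting` comes after `*hobbies`, it can only be passed by keyword.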


<a id="12"></a> <br>
### 12. Lambda Functions

In [None]:
# Sample function with lambda

# Define a lambda function that multiplies argument a with argument b
x = lambda a, b: a * b
print(x(5, 6)) # prints the result of the lambda function: 30

# Define a lambda function that adds argument a with argument b
x = lambda a, b: a + b
print(x(5, 6)) # prints the result of the lambda function: 11
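
Lambdas are most often passed directly to another function rather than stored in a variable. A short sketch with made-up data, using `sorted()`, `map()`, and `filter()`:

```python
people = [("John", 25), ("Mia", 19), ("Jack", 31)]  # hypothetical data

# key= receives each tuple and returns the value to sort by (the age).
by_age = sorted(people, key=lambda person: person[1])
print(by_age)

# map() and filter() also accept small lambdas.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))
evens = list(filter(lambda n: n % 2 == 0, range(10)))
print(doubled, evens)
```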


<a id="13"></a> <br>
### 13. Datetime

In [None]:
# The datetime module is part of the Python standard library and is used to work with dates and times.
# datetime.datetime.now() returns the current local date and time.
# strftime() formats a datetime as text using format codes such as %Y, %m, and %d.

import datetime
current_date = datetime.datetime.now() # returns the current date and time
print(f"current_date : {current_date} \n") # prints the current date and time
print(f"current_date.year : {current_date.year} \n") # prints the current year
print(f"current_date.month : {current_date.month} \n") # prints the current month
print(f"current_date.day : {current_date.day} \n") # prints the current day
print(f"current_date.hour : {current_date.hour} \n") # prints the current hour
print(f"current_date.minute : {current_date.minute} \n") # prints the current minute
print(f"current_date.second : {current_date.second} \n") # prints the current second
print(f"current_date.microsecond : {current_date.microsecond} \n") # prints the current microsecond
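print(f"current_date.tzinfo : {current_date.tzinfo} \n") # prints the time zone info; None here, because now() returns a naive datetime
print(f"current_date.timestamp() : {current_date.timestamp()} \n") # prints the POSIX timestamp (seconds since 1970-01-01 UTC)
print(f"datetime.datetime.now(datetime.timezone.utc) : {datetime.datetime.now(datetime.timezone.utc)} \n") # prints the current UTC date and time; utcnow() is deprecated since Python 3.12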
print(f"current_date.strftime('%Y-%m-%d %H:%M:%S') : {current_date.strftime('%Y-%m-%d %H:%M:%S')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%Y-%m-%d %H:%M:%S %p') : {current_date.strftime('%Y-%m-%d %H:%M:%S %p')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%Y-%m-%d %H:%M:%S %p %Z') : {current_date.strftime('%Y-%m-%d %H:%M:%S %p %Z')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%Y-%m-%d %H:%M:%S %p %Z %z') : {current_date.strftime('%Y-%m-%d %H:%M:%S %p %Z %z')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%A %B %d, %Y') : {current_date.strftime('%A %B %d, %Y')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%A %B %d, %Y %I:%M %p') : {current_date.strftime('%A %B %d, %Y %I:%M %p')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%A %B %d, %Y %I:%M %p %Z') : {current_date.strftime('%A %B %d, %Y %I:%M %p %Z')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%A %B %d, %Y %I:%M %p %Z %z') : {current_date.strftime('%A %B %d, %Y %I:%M %p %Z %z')} \n") # prints the current date and time in the format specified
print(f"current_date.strftime('%A %B %d, %Y %I:%M %p %Z %z %A %B %d, %Y %I:%M %p %Z %z') : {current_date.strftime('%A %B %d, %Y %I:%M %p %Z %z %A %B %d, %Y %I:%M %p %Z %z')} \n") # prints the current date and time in the format specified
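
Besides formatting, the datetime module also supports date arithmetic and parsing. The sketch below uses fixed, made-up dates so that its output does not depend on when it is run:

```python
from datetime import datetime, timedelta

# Fixed, made-up dates so the output does not depend on today's date.
start = datetime(2019, 1, 1, 9, 30)
deadline = start + timedelta(days=10, hours=2)
print(deadline)

# Subtracting two datetimes gives a timedelta.
gap = deadline - start
print(gap.days, gap.total_seconds())

# strptime() parses text into a datetime; it is the inverse of strftime().
parsed = datetime.strptime("2019-01-01 09:30:00", "%Y-%m-%d %H:%M:%S")
print(parsed == start)
```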


<a id="14"></a> <br>
### 14. Numpy

* NumPy is a Python package for scientific computing.

* NumPy is a powerful tool for working with large arrays and matrices of numeric data.

* NumPy is a general-purpose library that provides a high-performance multidimensional array object, and tools for working with these arrays.

* NumPy itself provides element-wise mathematical operations, basic statistics (mean, median, standard deviation), linear algebra (`np.linalg`), Fourier transforms (`np.fft`), and random number generation (`np.random`).
* Libraries for image processing, signal processing, optimization, and machine learning, such as SciPy and scikit-learn, are built on top of NumPy arrays.

In [None]:
# install the NumPy package using pip

#!pip install --upgrade pip
#!pip install --upgrade numpy

In [None]:
# import the NumPy package
import numpy as np


In [None]:
# Create a numpy array
a = np.array([1,2,3,4,5]) # creates a numpy array from a list
print(f"a : {a} \n") # prints the numpy array
b= np.array([[1,2,3],[4,5,6]]) # creates a numpy array from a list of lists
print(f"b : {b} \n") # prints the numpy array
c= np.array([[1,2,3],[4,5,6]], dtype=np.float64) # creates a numpy array from a list of lists and specifies the data type
print(f"c : {c} \n") # prints the numpy array
d= np.zeros((3,4)) # creates a numpy array of zeros with 3 rows and 4 columns
print(f"d : {d} \n") # prints the numpy array
e= np.ones((3,4)) # creates a numpy array of ones with 3 rows and 4 columns
print(f"e : {e} \n") # prints the numpy array
f= np.empty((3,4)) # creates an uninitialized numpy array with 3 rows and 4 columns; its values are arbitrary
print(f"f : {f} \n") # prints the numpy array
g= np.arange(10,20,2) # creates a numpy array from 10 up to (but not including) 20 with steps of 2
print(f"g : {g} \n") # prints the numpy array
h= np.arange(0,20,2).reshape(2,5) # creates a numpy array from 0 up to (but not including) 20 with steps of 2 and reshapes it to 2 rows and 5 columns
print(f"h : {h} \n") # prints the numpy array
i= np.linspace(0,2,9).reshape(3,3) # creates 9 evenly spaced values from 0 to 2 (both included) and reshapes them to 3 rows and 3 columns
print(f"i : {i} \n") # prints the numpy array



In [None]:
j= np.random.random((2,2)) # creates a 2x2 numpy array of random floats drawn uniformly from [0, 1) (0 included, 1 excluded)
print(f"j : {j} \n") # prints the numpy array
p=np.random.random_sample((3,3)) # same distribution as np.random.random (uniform on [0, 1)), here with 3 rows and 3 columns
print(f"p : {p} \n") # prints the numpy array
k= np.random.randint(10,20,(3,3)) # creates a numpy array of random integers from 10 to 19 (20 is excluded) with 3 rows and 3 columns
print(f"k : {k} \n") # prints the numpy array
l= np.random.randint(10,20,(3,3,3),dtype=np.int32) # creates a 3x3x3 numpy array of random integers from 10 to 19 and specifies the data type
print(f"l : {l} \n") # prints the numpy array
m= np.random.randint(10,20,(3,3,3),dtype=np.int32).reshape(27) # creates the same kind of 3x3x3 array and reshapes it to a 1-D array of 27 elements
print(f"m : {m} \n") # prints the numpy array
n=np.random.rand(3,3) # creates a numpy array of random numbers drawn uniformly from [0, 1) with 3 rows and 3 columns
print(f"n : {n} \n") # prints the numpy array
o=np.random.randn(3,3) # creates a numpy array of random numbers drawn from the standard normal distribution (mean 0, standard deviation 1) with 3 rows and 3 columns
print(f"o : {o} \n") # prints the numpy array

In [None]:
# data types
print(f"o.dtype : {o.dtype} \n") # prints the data type of the numpy array

In [None]:
# shape
print(f"o.shape : {o.shape} \n") # prints the shape of the numpy array

In [None]:
# size
print(f"o.size : {o.size} \n") # prints the size of the numpy array

In [None]:
# ndim
print(f"o.ndim : {o.ndim} \n") # prints the number of dimensions of the numpy array

In [None]:
# itemsize
print(f"o.itemsize : {o.itemsize} \n") # prints the size(in bytes) of each element in the numpy array

In [None]:
# nbytes
print(f"o.nbytes : {o.nbytes} \n") # prints the size(in bytes) of the numpy array

In [None]:
# Transpose
print(f"o.T : {o.T} \n") # prints the transpose of the numpy array

In [None]:
# Flatten
print(f"o.flatten() : {o.flatten()} \n") # prints the flattened version of the numpy array


In [None]:
# astype
print(f"o : {o} \n") # prints the numpy array
print(f"o.astype(np.int64) : {o.astype(np.int64)} \n") # prints the array converted to int64; the fractional part is truncated toward zero

In [None]:
# len
print(f"len(o) : {len(o)} \n") # prints the length of the first axis (the number of rows)

<a id="15"></a> <br>
### 15. NumPy Operations

In [None]:
arr=np.arange(10) # creates a numpy array from 0 to 9
arr1=np.arange(10,20) # creates a numpy array from 10 to 19
print(f"arr : {arr} \n") # prints the numpy array
print(f"arr1 : {arr1} \n") # prints the numpy array

In [None]:
# add two arrays

np.add(arr,arr1) # adds the two arrays

In [None]:
#subtract two arrays
np.subtract(arr,arr1) # subtracts the two arrays

In [None]:
# multiply two arrays
np.multiply(arr,arr1) # multiplies the two arrays, element-wise

In [None]:
# dot product of two arrays
np.dot(arr,arr1) # for 1-D arrays this is the inner product (a single number); for 2-D arrays it is matrix multiplication

In [None]:
# divide two arrays
np.divide(arr,arr1) # divides the two arrays, element-wise

In [None]:
# compare two arrays
np.equal(arr,arr1) # compares the two arrays, element-wise

In [None]:
# compare two arrays as a whole
np.array_equal(arr,arr1) # returns a single True or False: True only if both arrays have the same shape and elements
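
The named functions above have operator equivalents. The following sketch rebuilds `arr` and `arr1` so that it runs on its own, and checks that the operators give the same results:

```python
import numpy as np

arr = np.arange(10)        # 0..9
arr1 = np.arange(10, 20)   # 10..19

# The arithmetic operators are shorthand for the element-wise functions.
same_add = np.array_equal(arr + arr1, np.add(arr, arr1))
same_mul = np.array_equal(arr * arr1, np.multiply(arr, arr1))
print(same_add, same_mul)

# For 1-D arrays, np.dot and @ both give the inner product (a scalar).
inner = arr @ arr1
print(inner, inner == np.dot(arr, arr1))

# A scalar is broadcast to every element.
print(arr * 2)
```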

<a id="16"></a> <br>
### 16. Aggregate Function

In [None]:
# sum of an array
np.sum(arr) # sums the elements of the array

In [None]:
# mean of an array
np.mean(arr) # calculates the mean of the array

In [None]:
# median of an array
np.median(arr) # calculates the median of the array

In [None]:
# standard deviation of an array
np.std(arr) # calculates the standard deviation of the array

In [None]:
# min of an array
np.min(arr) # calculates the minimum of the array

In [None]:
# max of an array
np.max(arr) # calculates the maximum of the array

In [None]:
# argmin of an array
np.argmin(arr1) # calculates the index of the minimum of the array

In [None]:
# argmax of an array
np.argmax(arr1) # calculates the index of the maximum of the array

In [None]:
# indices where a condition holds
np.where(arr1>5) # returns a tuple of index arrays, one per dimension, for the elements that satisfy the condition

In [None]:
# indices where a condition holds, one row per match
np.argwhere(arr1>5) # returns a 2-D array in which each row is the index of one element that satisfies the condition
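
The aggregate functions above are applied to 1-D arrays. On a 2-D array, the `axis` argument chooses the direction of the aggregation. A small sketch with a made-up array `grid`:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]], made-up data

total = grid.sum()              # all elements
column_sums = grid.sum(axis=0)  # collapse the rows: one value per column
row_means = grid.mean(axis=1)   # collapse the columns: one value per row
print(total, column_sums, row_means)
```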

In [None]:
# subset,slicing ,indexing  of an array

arr3=arr[1:5] # returns the elements from arr from index 1 to 4
print(f"arr3 : {arr3} \n") # prints the numpy array
arr4=arr[1:5:2] # returns the elements from arr from index 1 to 4 with step 2
print(f"arr4 : {arr4} \n") # prints the numpy array
arr5=arr[::-1] # returns the elements from arr in reverse order
print(f"arr5 : {arr5} \n") # prints the numpy array
arr6=arr[::-2] # returns the elements from arr in reverse order with step 2
print(f"arr6 : {arr6} \n") # prints the numpy array
arr7=arr[::-3] # returns the elements from arr in reverse order with step 3
print(f"arr7 : {arr7} \n") # prints the numpy array
arr8=arr.reshape((2,5))[0,2:3] # returns row 0, columns 2 up to (but not including) 3, as a 1-element array
print(f"arr8 : {arr8} \n") # prints the numpy array


In [None]:
# masking of an array
arr2=arr[arr>5] # returns the elements from arr that are greater than 5
print(f"arr2 : {arr2} \n") # prints the numpy array
arr3=arr[(arr>5) & ( arr<8)] # returns the elements from arr that are greater than 5 and less than 8
print(f"arr3 : {arr3} \n") # prints the numpy array
arr4=arr[(arr>5) | (arr<8)] # returns the elements from arr that are greater than 5 or less than 8 (here that is every element)
print(f"arr4 : {arr4} \n") # prints the numpy array


In [None]:
# Array Manipulation
# Create array
a = np.arange(12).reshape(3, 4) # Create array with the values 0-11 in 3 rows and 4 columns
b = np.zeros((3,5)) # Create array with zeroes in 3 rows and 5 columns
c = np.ones( (2,3,4), dtype=np.int16 ) # Create array with ones and define the data type as int16
d = np.ones((3,4))* 5 # Create array with ones in 3 rows and 4 columns and multiply by 5

In [None]:
print(f"D : {d} \n") # prints the numpy array

In [None]:
# Transpose of an array
# d.T # Transpose of an array
d.transpose() # Transpose of an array

In [None]:
# Flatten an array
d.flatten() # returns a flattened 1-D copy of the array
d.ravel() # returns a flattened 1-D array, as a view of the original when possible

In [None]:
# reshape of an array
d.reshape(2,6) # reshape of an array

In [None]:
# append axis 0 an array
arr = np.append(d,a,axis=0) # append all the elements of d and a in a single array along the axis 0 (row) ,returns a new array with the elements of d and a
arr

In [None]:
# append axis 1 an array
arr = np.append(d,a,axis=1) # append all the elements of d and a in a single array along the axis 1 (column) ,returns a new array with the elements of d and a
arr

In [None]:
#concatenate axis 0 an array
arr = np.concatenate((d,a),axis=0) # concatenate all the elements of d and a in a single array along the axis 0 (row) ,returns a new array with the elements of d and a
arr

In [None]:
# concatenate axis 1 an array
arr = np.concatenate((d,a),axis=1) # concatenate all the elements of d and a in a single array along the axis 1 (column) ,returns a new array with the elements of d and a
arr

In [None]:
# vertically stack two arrays
arr = np.vstack((d,a)) # vertically stack all the elements of d and a in a single array along the axis 0 (row) ,returns a new array with the elements of d and a
arr

In [None]:
# horizontally stack two arrays
arr = np.hstack((d,a)) # horizontally stack all the elements of d and a in a single array along the axis 1 (column) ,returns a new array with the elements of d and a
arr

In [None]:
# stack two arrays along a new first axis
arr = np.stack((d,a),axis=0) # joins d and a along a new axis 0; two (3,4) arrays give a new array of shape (2,3,4)
arr

In [None]:
# stack two arrays along a new second axis
arr = np.stack((d,a),axis=1) # joins d and a along a new axis 1; two (3,4) arrays give a new array of shape (3,2,4)
arr

In [None]:
# split an array along axis 0 (rows)
arr = np.split(d,3,axis=0) # split d into 3 equal arrays along axis 0 (rows); returns a list of three (1,4) arrays
arr

In [None]:
# split an array along axis 1 (columns)
arr = np.split(d,2,axis=1) # split d into 2 equal arrays along axis 1 (columns); returns a list of two (3,2) arrays
arr

In [None]:
# vertically split an array
arr = np.vsplit(d,3) # split d into 3 arrays along axis 0 (rows); returns a list of arrays
arr


In [None]:
# horizontally split an array
arr = np.hsplit(d,2) # split d into 2 arrays along axis 1 (columns); returns a list of two arrays
arr

<a id="17"></a> <br>
### 17. Pandas

* Pandas is a library for data manipulation and analysis.
* The pandas module provides a high-level interface to data through its DataFrame and Series objects.
* A pandas DataFrame is a table-like, 2-dimensional data structure with labeled axes (rows and columns).
* In pandas, a DataFrame is represented as a collection of Series objects that share an index.
* A pandas Series is a one-dimensional array with a single labeled axis (the index).
* The 3-dimensional Panel (items, major_axis, minor_axis) was deprecated and has been removed from current pandas versions.
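
The relationship between Series and DataFrame can be sketched with a tiny made-up data set (the names `s`, `df`, `score`, and `passed` are illustrative):

```python
import pandas as pd

# A Series is a one-dimensional labelled array.
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="score")

# A DataFrame is a table whose columns are Series sharing one index.
df = pd.DataFrame({"score": s, "passed": s >= 20})
print(df)

# Selecting one column gives back a Series.
print(type(df["score"]).__name__)
print(df.loc["b", "score"])
```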

In [None]:
# install pandas module
#!pip install -U pandas

In [None]:
# import pandas module
import pandas as pd
import numpy as np

# Sample DataFrame with Pandas
df = pd.DataFrame({'A': [1, 2, 3, 4],
                     'B': [5, 6, 7, 8],
                        'C': [9, 10, 11, 12]},
                        index=[1, 2, 3, 4])

df # prints the dataframe


In [None]:
# Another sample dataframe df1 - using a NumPy array with a datetime index and labeled columns
dates = pd.date_range('2019-01-01', periods=24, freq='M') # Create 24 month-end dates, from 2019-01-31 to 2020-12-31
df1 = pd.DataFrame(np.random.randn(24, 4), index=dates, columns=['A','B','C','D'])
df1 # Display dataframe df1

In [None]:
# Another sample dataframe df2 - using a NumPy array with a datetime index and labeled columns
hours = pd.date_range('2019-01-01', periods=24, freq='H') # Create 24 hourly timestamps, from 2019-01-01 00:00 to 2019-01-01 23:00
df2 = pd.DataFrame(np.random.randn(24, 4), index=hours, columns=list('ABCD'))
df2 # Display dataframe df2

In [None]:
# display the index of dataframe df1
df1.index 

In [None]:
# view top 5 rows of dataframe df1
df1.head() # head() returns the first 5 rows by default

In [None]:
# view bottom 5 rows of dataframe df1
df1.tail() # tail() returns the last 5 rows by default

In [None]:
# display the columns of dataframe df1
df1.columns

In [None]:
# display the shape of dataframe df1
df1.shape

In [None]:
# display the dtypes of dataframe df1
df1.dtypes

In [None]:
# display the values of dataframe df1
df1.values

In [None]:
# display Descriptive statistics of dataframe df1
df1.describe(include='all') 

In [None]:
# Transpose of a DataFrame
df1.T # rows become columns and columns become rows

In [None]:
# Sort the DataFrame by index
df1.sort_index(axis=0, ascending=False) # Sort the DataFrame by index axis=0 (row) and ascending=False (descending)


In [None]:
# Sort the DataFrame by column labels
df1.sort_index(axis=1, ascending=False) # axis=1 sorts the column labels; ascending=False gives descending order (D, C, B, A)

In [None]:
# Sort the DataFrame by row index and by column labels
df1.sort_index(axis=0, ascending=False).sort_index(axis=1, ascending=False) # first sort the rows (axis=0), then the columns (axis=1), both in descending order

In [None]:
# Sort values of DataFrame by column
df1.sort_values(by='A', axis=0, ascending=False) # Sort values of DataFrame by column A and axis=0 (row) and ascending=False (descending)

In [None]:
# select a column of DataFrame
df1.A # select a column of DataFrame


In [None]:
# select a column of DataFrame
df1["A"] # select a column of DataFrame

In [None]:
# select rows of DataFrame by position
df1[3:10] # select rows at positions 3 to 9 (the end position 10 is excluded)

In [None]:
df1['2020-01-02':'2020-06-04'] # Select from index '2020-01-02' to '2020-06-04' of DataFrame

In [None]:
# select a single row of DataFrame by integer position
df1.iloc[3] # row at position 3 (the fourth row), returned as a Series

In [None]:
# select rows and columns of DataFrame by position
df1.iloc[1:5,:] # rows at positions 1 to 4 (5 is excluded) and all columns


In [None]:
# select rows and columns of DataFrame by position
df1.iloc[1:5,0:2] # rows at positions 1 to 4 and columns at positions 0 and 1

In [29]:
# select rows and columns of DataFrame by label
df1.loc[:,['A','B']] # all rows, columns A and B


In [None]:
# slice rows and columns of DataFrame
df1.iloc[1:5,:-1] # rows at positions 1 to 4 and all columns except the last one
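The slices above use positions, where the end of the slice is excluded. Label-based slicing with `loc` behaves differently: the end label is included. A minimal sketch of the difference, using a small hypothetical frame named `frame`:

```python
import pandas as pd

# Small hypothetical frame with string labels as the index
frame = pd.DataFrame({"A": [10, 20, 30, 40], "B": [1, 2, 3, 4]},
                     index=["w", "x", "y", "z"])

by_position = frame.iloc[1:3]  # positions 1 and 2; the end position 3 is excluded
by_label = frame.loc["x":"z"]  # labels x, y and z; the end label is included

print(list(by_position.index))  # ['x', 'y']
print(list(by_label.index))     # ['x', 'y', 'z']
```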

In [None]:
# masking a DataFrame
df1.A > 0.5 # boolean Series: True where column A is greater than 0.5

In [30]:
# different ways to filter a DataFrame with a mask (all three give the same result)
df2=df1[df1.A > 0.5] 
#df2=df1.loc[df1.A > 0.5]
#df2=df1.loc[df1.A > 0.5,:]
df2


In [None]:
# copy a DataFrame
df2=df1.copy()
df2.head()

In [None]:
# add a column to a DataFrame
df2['E'] = pd.Series([i for i in range(df2.shape[0])], index=df1.index)
df2.head()

In [None]:
# add a column to a DataFrame
df2['F'] = 5
df2["a+b"]=df2["A"]+df2["B"] 
df2["a+c"]=df2["A"]+df2["C"]
df2.head()

In [31]:
# update columns of a DataFrame
df2.loc[df2.A > 0.5, 'F'] = None # mark F as missing where A > 0.5
df2.loc[(df2.B > 0.2), ['A','B']] = None # mark A and B as missing where B > 0.2
df2

In [None]:
# isin method for filtering

df2[df2["F"].isin([5])]


<a id="18"></a> <br>
## 18.Missing Data Handling

In [None]:
Df=df2.copy()

In [None]:
# check whether a DataFrame has missing values
Df.isna()
Df.isnull() # isnull() is an alias of isna(); only this last expression is displayed

In [None]:
# count missing values in DataFrame
Df.isna().sum()

In [None]:
Df.dropna(how='any') # Drop any rows that have missing data

In [None]:
Df.dropna(how='any',axis=1) # Drop any columns that have missing data

In [None]:
Df.dropna(thresh=5) # Keep only rows with at least 5 non-missing values (recent pandas does not allow how and thresh together)

In [None]:
Df.dropna(how='any',axis=1) # axis=1 works on columns: drop any columns (not rows) that have missing data

In [None]:
# isna for filtering: keep only the rows where B is not missing
df2[~df2.B.isna()] 

In [None]:
# drop a column of a DataFrame
Df.drop('F', axis=1, inplace=False)

In [None]:
# drop a row of a DataFrame
Df.drop(Df.index[:10], axis=0, inplace=False)

In [None]:
# fill missing data (fillna(method='ffill') is deprecated in recent pandas, so ffill() is used)
Df.ffill(axis=0) # Forward fill: copy the last valid value down each column

In [None]:
# fill missing data (fillna(method='bfill') is deprecated in recent pandas, so bfill() is used)
Df.bfill(axis=0) # Backward fill: copy the next valid value up each column; Df itself is not changed

In [32]:
# check if a DataFrame has missing data
Df.isna().sum()

<a id="19"></a> <br>
## 19.Pandas File handling


In [None]:
# Write a DataFrame to a CSV file
Df.to_csv('Df.csv') # the index is written as the first column


In [None]:
Df=pd.read_csv('Df.csv') # read a CSV file to a DataFrame
Df.head()

In [None]:
# set column names of a DataFrame ("Tarih" is Turkish for "date"; it names the column that holds the old index)
Df.columns = ["Tarih",'A', 'B', 'C', 'D', 'E', 'F','A+B','A+C']
Df.head()

In [None]:
#set index of a DataFrame
Df.set_index('Tarih', inplace=True) 
Df.head() 

In [None]:
# to write a DataFrame to an Excel file, install openpyxl
!pip install -U openpyxl 

In [None]:
# write a DataFrame to an Excel file
Df.to_excel('Df.xlsx')



In [None]:
# read an Excel file into a DataFrame
Df=pd.read_excel('Df.xlsx',engine='openpyxl',index_col=0) # index_col=0 uses the first column as the index
Df.head()
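The CSV steps above rename the columns and set the index after reading. As an alternative sketch, `read_csv` can restore a date index directly with `index_col=0` and `parse_dates=True`. The frame `frame`, the file name `frame.csv` and the temporary folder are assumptions made for this example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame with a daily datetime index
frame = pd.DataFrame({"value": [1, 2, 3]},
                     index=pd.date_range("2020-01-01", periods=3, freq="D"))

# Write to a temporary folder so no file is left behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "frame.csv")
    frame.to_csv(path)
    # index_col=0 uses the first column as index, parse_dates=True converts it to datetimes
    restored = pd.read_csv(path, index_col=0, parse_dates=True)

print(restored.head())
```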

<a id="20"></a> <br>
## 20.Matplotlib

**Matplotlib is a Python library that helps us to plot data. The easiest and most basic plots are line, scatter and histogram plots.**

* Line plot is better when the x axis is time.
* Scatter plot is better for checking whether there is a correlation between two variables.
* Histogram is better when we need to see the distribution of numerical data.
* Customization: colors, labels, thickness of line, title, opacity, grid, figsize, ticks of axis and linestyle.
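The scatter and histogram plots mentioned in the list, together with a few of the customization options, could look like the sketch below. The data is random and generated only for illustration; names such as `rng`, `ax_scatter` and `ax_hist` are not from the notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)               # seeded generator, so the data is reproducible
x = rng.normal(size=200)
y = 2 * x + rng.normal(scale=0.5, size=200)  # y is correlated with x by construction

fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(12, 5))

# Scatter plot: relation between two variables
ax_scatter.scatter(x, y, color="tab:blue", alpha=0.5, label="samples")
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")
ax_scatter.set_title("Scatter plot")
ax_scatter.legend()
ax_scatter.grid(True)

# Histogram: distribution of one numerical variable
ax_hist.hist(x, bins=20, color="tab:green", alpha=0.7)
ax_hist.set_xlabel("x")
ax_hist.set_ylabel("count")
ax_hist.set_title("Histogram")

fig.tight_layout()
```

In a notebook the figure is shown at the end of the cell; in a script, `plt.show()` displays it.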

In [None]:
# Matplotlib module for plotting data in Python 
# install matplotlib
#!pip install --upgrade matplotlib


In [None]:
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (20, 10) # set figure size
plt.style.use('ggplot') # set plot style


In [None]:
import numpy as np
import pandas as pd

# Generate a random DataFrame that looks like time series data: 1000 rows (one per day) and 1 column
df = pd.DataFrame(np.random.randn(1000)*100, columns=['Salary'],index=pd.date_range('1/1/2000', periods=1000))

df.head()

In [None]:
# plot a Time Series Data
plt.plot(df.index, df.Salary)

In [None]:
# On a DataFrame, the plot() method is convenient to plot all of the columns with labels
df4 = pd.DataFrame(np.random.randn(1000, 4), index=df.index,columns=['A', 'B','C', 'D'])
df4 = df4.cumsum()
df4.head()

In [None]:
# plot a DataFrame
df4.plot()
plt.show()

<a id="21"></a> <br>
## 21. Error Handling

In [None]:
# Error handling is an important part of writing Python programs.
# Common built-in exceptions are listed below:
# FileNotFoundError   
# AssertionError
# ZeroDivisionError
# IndexError
# ValueError
# TypeError
# AttributeError
# KeyError
# NameError
# ImportError
# IOError (an alias of OSError in Python 3)
# OSError
# RuntimeError
# RecursionError
# SyntaxError
# IndentationError
# TabError
# SystemError
# SystemExit
# UserWarning
# Warning
# UnicodeError

# we can use try and except to handle an error
# we can use the assert statement to raise an AssertionError when a condition is false
# we can use the raise statement to raise an error ourselves
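A minimal sketch of `try` and `except`, together with the optional `else` and `finally` parts, is given here. The function name `safe_divide` is made up for this example.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or None when the division is not possible."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # runs only when the denominator is zero
        print("cannot divide by zero")
        return None
    except TypeError as err:
        # runs when the operands do not support division, e.g. a string
        print("unsupported operand:", err)
        return None
    else:
        # runs only when no exception was raised
        return result
    finally:
        # runs in every case, even after a return
        print("division attempted")

print(safe_divide(10, 2))  # prints "division attempted", then 5.0
print(safe_divide(10, 0))  # prints both messages, then None
```

Each `except` clause names the exception type it handles, so unexpected errors are not hidden.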



In [None]:
import math

class Circle:
    def __init__(self, radius):
        if radius < 0:
            raise ValueError("positive radius expected")
        # Store only the radius. Attributes named area or perimeter would hide
        # the methods below and would go stale when the radius changes.
        self.radius = radius

    def set_radius(self, other):
        assert other > 0, "positive radius expected" # if the condition is false, assert raises AssertionError
        self.radius = other

    def setradius2(self, other):
        if other < 0: # if this condition is true, raise an exception
            raise ValueError("positive radius expected")
        self.radius = other

    def get_radius(self):
        return self.radius

    def area(self):
        return math.pi * self.radius ** 2

    def perimeter(self):
        return 2 * math.pi * self.radius

    def __str__(self):
        return "Circle with radius: " + str(self.radius)

    def __repr__(self):
        # area and perimeter are methods, so they are called here
        return "Circle(" + str(self.radius) + " -" + str(self.area()) + " -" + str(self.perimeter()) + ")"

   
   

In [None]:
obj=Circle(-12) # ValueError: positive radius expected

In [None]:
circ=Circle(120)
str(circ) # str() calls circ.__str__()

In [None]:
repr(circ) # repr() calls circ.__repr__()

In [None]:
circ.set_radius(-12) # AssertionError: positive radius expected. The method stops at the failed assert, so the radius is not changed.

In [None]:
circ.get_radius()

In [None]:
circ.setradius2(-1512) # ValueError: positive radius expected. raise stops the function immediately, so no return is needed and the radius is not changed.

In [None]:
circ.get_radius()

In [None]:
student_number = input("Enter your student number:")
print(type(student_number)) # input() always returns a str
try:
    if not student_number.isnumeric():
        raise ValueError("Student number should be numeric")

    #assert isinstance(student_number, str), "student number should be read as a string"

    if int(student_number) != 0:
        print("Welcome student {}".format(student_number))
    else:
        print("Try again!")
except ValueError as err: # catch the same exception type that is raised above
    print(err)

In [33]:
class ErrorHand():
    def __init__(self, name, age):
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        if not isinstance(age, int):
            raise TypeError("age must be an integer")
        self.person={
             "name": name,
             "age":age
        }
        
    def __str__(self):
        return "Name: {}, Age: {}".format(self.person["name"], self.person["age"])
    def __repr__(self):
        return "ErrorHand({}, {})".format(self.person["name"], self.person["age"])
    def __getitem__(self, key):
        if key == 0:
            return self.person["name"]
        elif key == 1:
            return self.person["age"]
        else:
            raise IndexError("Index out of range")
    def __setitem__(self, key, value):
        if key == 0:
            assert isinstance(value, str), "name must be a string" # if the condition is false, assert raises AssertionError
            self.person["name"] = value
        elif key == 1:
            assert isinstance(value, int), "age must be an integer" #if the condition is not true, assert an exception
            self.person["age"] = value
        else:
            raise IndexError("Index out of range")
    def __delitem__(self, key):
        if key == 0:
            self.person["name"] = ""
        elif key == 1:
            self.person["age"] = 0
        else:
            raise IndexError("Index out of range")
    def _sum(self):
        assert sum([1, 2, 3]) == 6

    def _len(self):
        assert len([1, 2, 3]) > 0

    def reversed(self):
        assert list(reversed([1, 2, 3])) == [3, 2, 1]

    def _membership(self):
        assert 3 in [1, 2, 3]

    def _isinstance(self):
        assert isinstance([1, 2, 3], list)

    def _all(self):
        assert all([True, True, True])

    def _any(self):
        assert any([False, True, False])

    def _always_fail(self):
        assert pow(10, 2) == 42

In [34]:
err=ErrorHand(25,"John") # TypeError: name must be a string

In [None]:
err=ErrorHand("25","John") # TypeError: age must be an integer

In [None]:
err=ErrorHand("John",26)


In [None]:
str(err)


In [None]:
repr(err)


In [None]:
err[0]

In [None]:
err[0] = 35 # AssertionError: the name must be a string

In [None]:
err[1]=26

In [None]:
err[2]=45 # IndexError: Index out of range

<a id="19"></a> <br>
### Error handling with assert

In [None]:

def square(x):
    assert x > 0, "only positive numbers are allowed" # if the condition is false, raise AssertionError
    return x ** 2

try:
    x=square(-2)
    print(x)
except AssertionError as error:
    print(error)

<a id="20"></a> <br>
### Error handling with raise

In [None]:

def square(x):
    if x < 0: # if the condition is true, raise an error
        raise ValueError("only positive numbers are allowed")
    return x ** 2

try:
    x=square(-2)
    print(x)
except ValueError as error:
    print(error)
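One practical difference between the two versions of `square`: `assert` statements are skipped when Python runs with the `-O` option, while `raise` always runs. So `raise` is usually the safer choice for checking input. The sketch below shows this by running a one-line program (`snippet`, made up for this example) in a separate Python process.

```python
import subprocess
import sys

# One-line program: a failing assert followed by a print
snippet = "assert False, 'never checked'; print('reached')"

# Normal mode: the assert fails, so the process ends with an error
normal = subprocess.run([sys.executable, "-c", snippet],
                        capture_output=True, text=True)

# Optimized mode (-O): assert statements are removed, so the print runs
optimized = subprocess.run([sys.executable, "-O", "-c", snippet],
                           capture_output=True, text=True)

print(normal.returncode != 0)    # True
print(optimized.stdout.strip())  # reached
```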

> **I hope this notebook is helpful. Updates will come soon.**