- Here is a page for [Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
- Here is a page for [Python Library Documentation](https://docs.python.org/3/library/)

**_Quick tips:_**
* Variables are stored in the kernel (the backend process that runs the code). If the kernel restarts, all variables are lost and must be reinitialized.
* In command mode (blue border), L toggles line numbers in the selected code cell.
* In command mode, M and Y switch the cell to Markdown and code mode, respectively.
* To comment out multiple lines, select them and press Ctrl + /.
* To have no output when the file is opened, use the Kernel menu (Restart & Clear Output) or the Cell menu (All Output, then Clear).
* To toggle between showing and hiding a cell's output, press O (I remapped this key to the Clear Cell Output shortcut).
* To edit keyboard shortcuts, press H and assign or edit them.
* To open the command palette, press Ctrl + Shift + P, then type the command you want.

1. Numerical (quantitative) variables are either continuous (floats) or discrete (integers).
2. Categorical variables are either nominal (no inherent order, e.g. boolean, string, None) or ordinal (categories with a natural order, e.g. low < medium < high).
3. The mean of a list of booleans is the fraction of True values, because True counts as 1 and False as 0.
4. Shift + L toggles line numbers in all cells, so you can see how many lines each cell has. Pressing L toggles them for the selected cell only.

In [None]:
pip list # to see a list of the libraries installed on your machine

In [None]:
# importing some libraries that will be needed
import numpy as np
import pandas as pd
import math
import sys
import string

# Some General Functions

In [None]:
sys.version # to check the version of the python we're running

In [None]:
# %whos # lists the objects/variables currently in memory - an IPython magic, so it works in IPython and Jupyter but not in the plain Python shell

In [None]:
n = [1, 3, 6, 8, 13, 1, 4]
print(type(n))
sum(n) # sum of the numbers in the list

In [None]:
len(n) # length of the list

In [None]:
x = round(math.pi, 4)  # pi from the math library, rounded to 4 decimals
print(x) ; del x ; x  # del removes x, so the final x raises NameError

In [None]:
str(math.pi) # to convert data type to string

In [None]:
l = ([True, 6 == 5, 4 > 3, None is None, 7]) 
print(l)
sum(l)/len(l) # there are 3 true values and a 7 totalling 10 divided by 5 gets 2 (as float)

In [None]:
l.sort() # sorts the items in place, so the ordering of the original list changes too
l

In [None]:
np.mean(l)

In [None]:
[None]*5 # to create a list containing 5 None items
type(None)

In [None]:
a = np.array([0,1,2,3,4,5,6,7,8,9,10])
type(a)

# Data Types
In Python, there are several built-in data types: 
1. Numeric: int, float, complex
2. Sequences: list, tuple, range
3. Text: str
4. Mapping: dict
5. Set: set, frozenset
6. Boolean: bool
7. Binary: bytes, bytearray
8. None: NoneType

In addition to these built-in data types, Python also supports ***custom data types***, such as **classes** and **objects**.
Let's Look at the Sequences:

## Strings

In [None]:
txt = "She said \"Never let go\"."  # Backslashes (\) are used to escape characters
print(txt)
print("length of txt is " + str(len(txt)) + " characters")
print("m" in txt)

In [None]:
txt  = "Gishar"
print(txt[1] + ", " + txt[-1] + ", " + txt[4:6] + ", " + txt[:4] + ", " + txt[-3:])

In [None]:
for letter in txt:
    print(letter)

In [None]:
txt.lower() + ' ' + txt.upper()  # make the string all lowercase or all uppercase

In [None]:
("   There were three chars first and 5 at the end!     ").strip() # strip() removes leading and trailing whitespace (3 spaces at the start, 5 at the end)

In [None]:
("   There were three chars first and 5 at the end!     ").strip().title() # title() capitalizes the first letter of every word

In [None]:
txt.split('s') # use a character to split text

In [None]:
# practice with split() and set(): only one "THE" and one "DOVE" remain. Without upper(), both "The" and "the" would be kept.
set("The dove dove into the water".upper().split())

In [None]:
# continue from here: https://www.codecademy.com/learn/learn-python-3/modules/learn-python3-strings/cheatsheet

## Integer, Float, Boolean

In [None]:
x = 1 ; y = 2.3
type(x) # type to see the type of an object

In [None]:
float(x) # to convert to float

In [None]:
float(True) # converting a boolean to float gives 0.0 for False and 1.0 for True

In [None]:
int(y) # to convert to integer

In [None]:
str(y) # to convert to string

In [None]:
print(1>3) # print the logical result of operation
bool(0) # 0 converts to False; any other number converts to True

## Lists
Lists in Python are a collection of items, ordered and mutable. A list can contain items of different data types, such as integers, floating-point numbers, strings, and so on. Lists are defined using square brackets, with the items separated by commas. 

Lists allow you to store, manipulate and retrieve elements efficiently, making them one of the most commonly used data structures in Python. You can access individual elements using indexing, add elements to a list using the _append_ method, remove elements using the _remove_ method, and _sort_ the list using the sort method, among other operations.
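
The operations named above can be sketched in a few lines. This is a minimal illustration; the `scores` list and its values are made up for the example.

```python
# Hypothetical data for illustration
scores = [72, 95, 88]

first = scores[0]    # indexing starts at 0
last = scores[-1]    # negative indices count from the end

scores.append(60)    # add one item to the end
scores.remove(95)    # remove the first occurrence of a value
scores.sort()        # sort in place (the list itself changes)

print(first, last)
print(scores)
```

Note that `append`, `remove`, and `sort` all modify the list itself and return `None`, so there is no need to reassign the result.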

In [None]:
heights = [["Noelle", 61], ["Ava", 70], ["Sam", 67], ["Mia", 64]] # 2D list is a good structure for representing grids
myclass = [
        ["Kenny", "American", 9],
        ["Tanya", "Ukrainian", 9],
        ["Madison", "Indian", 7]
        ]
print(heights[2][1]) # to call an item in a 2D list we use two [], first to call the list index, second to call the item index in the list
print(myclass[1]) # print the whole list in the 2D list called by its index
print(myclass[-2][-2])

In [None]:
myclass[1][1] = "Albanian" # same index as [-2][-2]
print(myclass[-2][-2])

In [None]:
myclass[1].remove("Tanya") # to remove or append an item from/to a list inside a 2D list, apply the function on the specific list inside!
myclass

In [None]:
myclass.sort() # to sort the list (based on the first elements of each list inside a 2D list)
myclass

In [None]:
x = [] # create an empty list with a literal
x = list() # create an empty list with the list() constructor

In [None]:
x = [3, 1, 20, 2, 2, 3, 5, "hi"]  # a list can contain anything
print(x[1]) # to see the item on index 1 (indices on lists start from 0)
print(x[2:6]) # to see the items on index 2 to 5 (last number exclusive)

In [None]:
x.count(3)

In [None]:
x.append(2)
x

In [None]:
# different ways to update or extract info from a list
x.append([True]) # Appends the item to the end of the list
x.append(True) # Appends the item to the end of the list
x

In [None]:
x[1:2]=["Two", "three"]
x += [23, "No", False] # += adds several items at once, unlike append, which adds only one; extend does the same
x

In [None]:
# [None] is a list with one null (no-value) item. None is not the same as 0, False, or an empty string. None has a data type of its own (NoneType) and only None can be None.
x.append([None]*5)

In [None]:
x.remove(2) # removes the item from the list the first time it shows up in the list (for removing by index, use pop() method)

In [None]:
x = x[0:5]

In [None]:
x.extend([6, "Yes", False])  # to append more than 1 item, we extend: equivalent to concatenating two lists

In [None]:
x.insert(0, 3) # to insert a new entry into a list using index: .insert(index, value)
x.insert(-4, "text this time")

In [None]:
x.pop(1) # to remove an item from a list using index: .pop(index) - without an index it simply removes the last one
x

In [None]:
r = x.pop(1) # we can save the removed value to a variable if we care to use it later
print(r)  # stores the removed (popped) item into another variable
print(x)

In [None]:
name = list("GISHAR")
print("name = ", name) # this is what list function does to strings.

In [None]:
name.sort() # sorts the original list in place; printing the list afterwards shows the sorted version
print("sorted name = ", name)

In [None]:
name.sort(reverse=True)
print("reverse sorted name = ", name)

### Numpy Arrays
NumPy arrays are **multi-dimensional arrays** used in numerical computing with Python. They allow for fast computation and manipulation of large arrays of data. They support a variety of operations including *element-wise operations, slicing, and indexing*. **NumPy arrays are more efficient than regular Python lists for large data sets and are widely used in data analysis and scientific computing**.
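
As a rough sketch of the element-wise operations, slicing, and indexing mentioned above (the `temps_c` readings are invented for the example):

```python
import numpy as np

temps_c = np.array([12.0, 18.5, 21.0, 25.5])  # hypothetical readings

temps_f = temps_c * 9 / 5 + 32   # element-wise arithmetic, no loop needed
warm = temps_c[temps_c > 20]     # boolean indexing keeps the matching items
middle = temps_c[1:3]            # slicing works as it does for lists

grid = np.arange(6).reshape(2, 3)  # 2 rows, 3 columns
second_column = grid[:, 1]         # all rows, column index 1

print(temps_f)
print(warm, middle, second_column)
```

With a plain Python list, `temps_c * 9` would repeat the list nine times instead of multiplying each item, which is one practical reason to convert numeric data to an array first.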

Let's look at some commonly used NumPy functions, methods, and attributes:

`np.array()`, `np.arange()`, `np.zeros()`, `np.ones()`, `np.linspace()`, `np.eye()`,

`np.random.rand()`, `np.random.randn()`, `np.random.randint()`,

`.reshape()`, `.shape`, `.dtype`, `.max()`, `.argmax()`, `.min()`, `.argmin()`

In [None]:
np.random.seed(101) # to use a specific seed for generating random numbers

In [None]:
mylist = [3, 1, 20, 2, 2, 3, 5]
nplist = np.array(mylist)
print(mylist)
print(nplist)

In [None]:
mymatrix = [[1,2,3],[4,5,6],[7,8,9]]
npmatrix = np.array(mymatrix)
print(mymatrix)
print(npmatrix)

In [None]:
nprange = np.arange(1, 50, 3) # generating a range using numpy's arange function
print(nprange)

In [None]:
x = np.zeros(4) # generate a 1-D array of 4 zeros
y = np.zeros((3,3)) # generate a 3x3 matrix of zeros
z = np.ones((2,3)) # generate a 2x3 matrix of ones
print(x) ; print(y) ; print(z)

In [None]:
x = np.linspace(0.5, 20, 40) # generate 40 numbers between 0.5 and 20, evenly spaced, starting from 0.5
y = np.eye(4) # generate an NxN identity matrix (here N = 4)
print(x); print(y) 
y.shape # dimension of the object

In [None]:
np.random.rand(3, 2, 3) # generate random numbers from uniform distribution between 0 and 1 placed in the form asked

In [None]:
np.random.randn(3, 2) # generate random numbers from std normal distribution placed in the form asked

In [None]:
np.random.randint(1, 100, 6) # generate 6 random integers from 1 to 99 (the upper bound 100 is exclusive)

In [None]:
x = np.arange(30)
print(x)
x.reshape(5,6) # to reshape the data into a matrix form 

In [None]:
# argmax returns the index of the max and similarly argmin
x = np.random.randint(1, 100, 6)
print(x); print(x.max()); print(x.argmax()); print(x.min()); print(x.argmin())

In [None]:
print(x.dtype) # type of the data stored in the object
print(type(x)) # type of the object itself

### Range
`range` is a built-in that generates a sequence of numbers from a starting number up to (but not including) an ending number, with a specified step.
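
A short sketch of the start, stop, and step arguments, including a negative step. The variable names are only illustrative.

```python
evens = range(2, 20, 2)      # start=2, stop=20 (exclusive), step=2
countdown = range(5, 0, -1)  # a negative step counts down

print(list(evens))
print(list(countdown))

# A range supports len(), indexing, and membership tests without building a list
print(len(evens), evens[3], 10 in evens)
```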

In [None]:
y = range(10) # range(start, end/exclusive, step) is unique in that it creates a range object! to use it as a list, we have to convert it using the list() function
print(y)
list(y)

In [None]:
z = list(range(2, 20, 2))
print(z)
len(z) # to see the length of a list

In [None]:
names = ["Jenny", "Alexus", "Sam", "Grace"]
heights = [61, 70, 67, 64]
# zip takes two (or more) iterables and returns an iterator of tuples; each tuple holds one element from each input. Convert it with list() to see the pairs.
names_and_heights = zip(names, heights)
print(names_and_heights)
print(list(names_and_heights))

## Tuple
A tuple is a collection of **ordered, immutable, and heterogeneous** data elements in Python. It's similar to a list, but you can't modify its elements once it's created. Tuples are declared using parentheses ( ), with their elements separated by commas.

Tuples are commonly used to represent ordered collections of **data that shouldn't change**, for example, coordinates in a map, or dates in a calendar. They're also **faster than lists for some operations and use less memory, making them ideal for data that doesn't need to be modified**.
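
The following sketch illustrates immutability and one consequence of it: tuples can be used as dictionary keys. The coordinates and the `city_by_point` name are made up, and the byte counts printed at the end depend on the Python interpreter and version.

```python
import sys

point = (40.7, -74.0)  # hypothetical map coordinates

try:
    point[0] = 41.0    # tuples do not support item assignment
except TypeError as err:
    print("Cannot modify a tuple:", err)

# Tuples are hashable, so they can serve as dictionary keys
city_by_point = {point: "New York"}
print(city_by_point[(40.7, -74.0)])

# Size of a tuple versus a list with the same items (interpreter-dependent)
print(sys.getsizeof((1, 2, 3)), sys.getsizeof([1, 2, 3]))
```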

In [None]:
mytuple = ('Mike', 24, 'Programmer')
mytuple[0] # works the same way as in lists
# mytuple[0] = "Joe" # this will error due to immutability of tuples

In [None]:
print(mytuple[1:]) # slicing works as in lists, but a tuple cannot be changed
name, age, occupation = mytuple # unpacking: assign the tuple to as many variables as it has elements (order matters)
x = (4,) # a 1-element tuple needs the trailing comma; otherwise it's just a number in parentheses
x

## Set
A set in Python is an unordered collection of unique elements. Sets are commonly used to **remove duplicates from a list** or to perform **mathematical set operations such as union, intersection, and difference**. They are defined using curly braces {} or the built-in **set()** function.
- lists and tuples can contain repeated values; sets cannot
- sets may look sorted when displayed (small integers often do), but they are unordered and the display order is not guaranteed
- tuples can't change; sets and lists can
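
A minimal sketch of the set operations mentioned above, using two small made-up sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b          # items in either set
intersection = a & b   # items in both sets
difference = a - b     # items in a but not in b

# Removing duplicates from a list; sorted() gives a predictable order back
unique_sorted = sorted(set([3, 1, 3, 2, 1]))

print(union, intersection, difference)
print(unique_sorted)
```

The same operations are also available as methods: `a.union(b)`, `a.intersection(b)`, and `a.difference(b)`.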

In [None]:
fruits = {"apple", "banana", "cherry", "apple", "apple", "cherry"}
fruits

In [None]:
mylist = [1, 20, 2, 2, 3, 5, 3, 2, 3, "hi"]
print(mylist) ; print(type(mylist))
myset = set(mylist)
print(myset) ; print(type(myset))
mytuple = (1, 20, 2, 2, 3, 5, 3, 2, 3, "hi")
print(mytuple) ; print(type(mytuple))

In [None]:
myset.add(1.5) # add an element to a set (can't be done to a tuple; for a list, we use the append method)
myset

In [None]:
mytuple.count(2) # how many of these values are in the tuple (count is used in lists too)

In [None]:
mytuple.index(3) # provide the index number for the value

In [None]:
mylist.index(3) # the index of only the first occurrence of the value in the list

## Dictionary
A dictionary is a collection of key-value pairs. It is a **mutable data structure indexed by keys**, and since Python 3.7 it preserves insertion order. You can access, add, remove, and update items using the keys. Keys must be unique and of a hashable data type (e.g. string, integer, tuple). Values can be of any data type. A dictionary is defined with curly braces {}, with a colon between each key and its value and commas between pairs. For example: `my_dict = {'key1': 'value1', 'key2': 'value2'}`
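
Accessing, adding, updating, and removing items could look like the sketch below. The `stock` inventory is invented for the example.

```python
stock = {"apples": 4, "pears": 0}  # hypothetical inventory

stock["plums"] = 7               # add a new key
stock["apples"] += 1             # update an existing value
removed = stock.pop("pears")     # remove a key and get its value back
missing = stock.get("kiwis", 0)  # .get avoids a KeyError for absent keys

for fruit, count in stock.items():
    print(fruit, count)
```

Using `stock["kiwis"]` directly would raise a `KeyError`, which is why `.get` with a default is often the safer way to read a key that may not exist.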

In [None]:
person = {"name": "John Doe", "age": 30, "city": "New York", 'Favorite Numbers':[3, 13, 42]} # create a dictionary
type(person)

### Pandas Dataframe
A dataframe is the main data structure of the pandas library. It is not the same thing as a dictionary, but it can be built from one, with each key becoming a column. It is a two-dimensional labeled data structure that can store data of different types, similar to a spreadsheet or SQL table, and it is commonly used for data analysis and manipulation. Each column in a dataframe is a pandas Series, and each row is identified by an index label. Dataframes have methods and attributes for selecting rows and columns, filtering, grouping, aggregating data, and much more.
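
As a small sketch of filtering, grouping, and aggregating, assuming a made-up `sales` table (none of these names or values come from the data used elsewhere in this notebook):

```python
import pandas as pd

# Hypothetical data for illustration
sales = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "units": [10, 4, 6, 8],
    "price": [2.5, 3.0, 2.5, 3.5],
})

big_orders = sales[sales["units"] > 5]                    # filter rows by a condition
units_by_region = sales.groupby("region")["units"].sum()  # group, then aggregate

print(type(sales["units"]))  # each column is a pandas Series
print(big_orders)
print(units_by_region)
```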

In [None]:
pd.DataFrame(person) # making a dataframe from the dictionary created above using pandas

In [None]:
type(pd.DataFrame(person))

In [None]:
# first takes a matrix, then index = for rownames, columns = for colnames
np.random.seed(101)
df = pd.DataFrame(np.random.randn(5, 4), 
                  index = 'A B C D E'.split(),
                  columns='W X Y Z'.split()
                 )
df

In [None]:
df.index # to extract the index of a dataframe

In [None]:
df['W'] # to get a column

In [None]:
df[['X', 'Y']] # to get multiple columns

In [None]:
df[0:1] # to get rows by row numbers

In [None]:
df.loc['A'] # also to get rows by index labels

In [None]:
df.iloc[0] # also to get rows by index number

In [None]:
df.loc[['C', 'D'], ['X', 'Y']] # to get a bunch of rows and columns from dataframe using index labels

In [None]:
df.iloc[2:4, 1:3] # to get a bunch of rows and columns from the dataframe using integer positions (end positions are exclusive)

In [None]:
df>0 # to make a condition out of the data in the dataframe

In [None]:
df[df>0]

In [None]:
df[df['W']>0]

In [None]:
df[df['W']>0]['X']

In [None]:
df[df['W']>0].iloc[1]

In [None]:
df[(df['W']>0) & (df['Y'] > 1)]

In [None]:
# drop = True to remove the original index / inplace is used to replace the original df
df2 = df.reset_index(drop=True, inplace=False) 
df2

In [None]:
df2['MyIndex'] = list(string.ascii_uppercase)[10:15] 
df2.set_index('MyIndex', inplace=True) 
df2

<div style="line-height:0.5">
<strong> Multi-Index in Pandas </strong>

Creating a *multi-indexed* dataframe using `MultiIndex` in Pandas 
`.from_tuples` builds a `MultiIndex` from a list of tuples
</div>    

In [None]:
outside = ['G1','G1','G1','G2','G2','G2'] 
inside = [1,2,3,1,2,3]
hier_index = list(zip(outside, inside)) # zip creates tuples of pairs for each outside and corresponding inside lists
# print(hier_index)
hier_index = pd.MultiIndex.from_tuples(hier_index) # use the list of zipped tuples and make a MultiIndex object
# print(hier_index)
df = pd.DataFrame(np.random.randn(6, 2), # make a dataframe with the MultiIndex object as the index
                  index = hier_index,
                  columns = ['A','B']
                 )
df

In [None]:
df.loc['G2']

In [None]:
df.loc['G2'].loc[3]['B']

In [None]:
df.index.names

In [None]:
df.index.names = ['Group', 'Number']
df

In [None]:
df.xs(('G1', 3)) # returns a specific row in multiindexed dataframe

In [None]:
df.xs(2, level = 'Number') # returns the row from all indexes where the Number index = 2

In [None]:
df = pd.read_csv("https://raw.githubusercontent.com/gishar/Learning_Python/main/Starwars.csv")
df.head(4)

In [None]:
type(df)

In [None]:
df.columns # to see the name of the variables / columns

In [None]:
df.head(3) # see the top 3 rows of data

In [None]:
df.tail(2)

In [None]:
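# .loc selects data by row labels and column names
# note: the columns listed here must exist in the loaded file; they do not include the hair_color and sex columns used further down
df = df.loc[:, ["cones", "ntrees", "dbh", "height", "cover", "sntrees", "sheight", "scover"]] # keep only these columns, dropping the index column
df.columns # to see the name of the variables / columns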

In [None]:
df.loc[:,"ntrees"].head(2)

In [None]:
df.loc[:3, ["cones", "ntrees", "cover", "height"]] # rows labeled 0 through 3 (.loc includes the end label, so 4 observations) for these columns, in the order written

In [None]:
df.loc[5:7] # specific rows (labels 5 through 7, inclusive) for all columns

In [None]:
# .iloc uses integer positions to slice the dataframe
df.iloc[:4] # first 4 rows of the dataframe
df.iloc[1:5, 2:4] # rows 1 to 4 and columns 2 to 3 (end positions 5 and 4 are exclusive) - note, positions start from 0

In [None]:
df.cones.unique() # to get the unique values in a column of a dataframe

In [None]:
df['cones'].unique()

In [None]:
# df.groupby(["cones", "ntrees"]).size()  # to group by the combination of two or more variables and show a summary value (size, mean, etc.)

In [None]:
pd.crosstab(df['hair_color'], df['sex']) # cross tabulation between two variables

In [None]:
df['sex'].value_counts() # find count of unique values for each category 

# Functions

In [None]:
def myfunction():
    """
    This is a description of what this function does. It comes in handy when reviewing the code or sending it to other people
    """
    print('Hello')
    print('Oh this is fun')
myfunction()

In [None]:
# an example of creating a function
def biglittle():
    text_with_no_space = input("write some text without spaces:")
    funny = max(text_with_no_space) + " " + min(text_with_no_space)
    return funny

In [None]:
biglittle()

In [None]:
# another example of a function
def greet(lang): # the function receives one input, labeled lang here, and prints a greeting based on it
    if lang == "es":
        print("Hola!")
    elif lang == "fr":
        print("Bonjour!")
    else:
        print("Hello!")

greet("fr") # passes "fr" as the argument to the function

In [None]:
# another example of a function
def greet(lang): # same function, but it returns the greeting instead of printing it
    if lang == "es":
        return "Hola!"
    elif lang == "fr":
        return "Bonjour!"
    else:
        return "Hello!"

print(greet("fr"), "Jean-claude")

In [None]:
# another example of a function
def addtwo(x, boo): # a function can receive more than one parameter; name them in order, work with them, and return a result
    """
    This function adds the two input numbers and returns their sum
    """
    added = x + boo
    return added # return is usually the last line in a function

try:
    a = float(input("Enter first number: "))
    b = float(input("Enter second number: "))
    print("the sum is equal to: ", addtwo(a,b))
except ValueError: # float() raises ValueError when the input is not a number
    print("Can't do that! Please enter numbers only")

# Indefinite Loop (While)

In [None]:
# While loops are called indefinite loops but mostly we use definite loops
n = 5 # this is the iteration variable - will need to change or otherwise the loop goes forever
while n > 0:
    print(n)
    n -= 1

print("It's Over")

In [None]:
n = 0
while True:
    line = input("Say my name: ")
    if line == "Done":
        break
    print(line)

print("Finally it's over!")

# Definite Loop (For)

In [None]:
n = list(range(1,6)) # just defining a list of numbers to work with in the For loop
for x in n:
    print(x)
    print("Oy!")

In [None]:
x = range(1, 20, 5)
out = []
for item in x:
    out.append(item**2)
print(out)

In [None]:
guys = ['ali', 'hasan', 'haji']
for name in guys:
    print("Happy new year, ", name)

In [None]:
a = 'abc&xyz'  # string
b = 3.16  # float
c = {2:'a', 3:'b'}  # dictionary
d = 6 < 2  # boolean
e = [1,2,3,4,5] # list
f = sum(e)  # value returned from a function
g = sum # a function

for obj in [a,b,c,d,e,f,g]:
    print(obj)