<p><a name="sections"></a></p>


# Python Cheatsheet
- <a href="#python tools">**I. Python Tools**</a>  
  - <a href="#boolean">**Boolean** Checks</a>
  - <a href="#lambda">**Lambda** Functions and Named Functions</a>
  - <a href="#modules">Making and Using **Modules**</a>
  - <a href="#lists">**Lists and List Operations**</a>
    - <a href="#listsrandom">**Randomly retrieving items** from a list: **random.choices()**</a>
    - <a href="#listszip">**Zipping** lists together</a>
    - <a href="#lists+">**Adding** things to lists</a>
      - <a href="#listsefficient">**List Efficiency** Optimized Ways to Grow a List</a>
    - <a href="#lists-">**Removing** things from lists</a>
    - <a href="#listscontents">**Information** on list contents</a>
    - <a href="#listsarrangement">**Arranging** list contents</a>
    - <a href="#listsiterations">**Iterating** through list contents</a>
    - <a href="#listsmap">**Applying a function** on each list element: **map**</a>
    - <a href="#listsfilter">**Filtering** based on some condition: **filter**</a>
  - <a href="#strings">**Strings and String Operations**</a>
    - <a href="#stringfind">**Finding** a substring</a>
    - <a href="#stringescapes">**Adding spaces and escape characters** to strings</a>
    - <a href="#stringlist">**Converting** between lists and strings</a>
    - <a href="#stringjoin">**Combining** strings with **join**</a>
    - <a href="#stringreplace">**Replacing** with **replace**</a>
    - <a href="#prefix">**Adding a PREFIX or SUFFIX** with list comprehension</a>
  - <a href="#loops">**FOR** and **WHILE** loops</a>
    - <a href="#enumerate">**return index and element**</a>
- <a href="#numpy">**Numpy**</a>
  - <a href="#Random">**Random Generators**</a>
  - <a href="#numspace">Creating an array of numbers **spaced over a certain interval**</a>
- <a href="#machine learning">**II. Machine Learning**</a>

### <p><a name="python tools"></a></p>
# I. Python Tools

###### This will be a brief overview of the Python tools that I need. Machine learning-specific code will be in section II.

###### <p><a name="boolean"></a></p>
### Boolean Checks

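To check whether two names point to the same object (the same memory address), use `is`:

```python
a is b
```

**For strings, `a is c` can return `True`** even when `a` and `c` were assigned separately, because CPython may reuse (intern) identical string literals. This is an implementation detail, so use `==` to compare values.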

In [66]:
a = b = [1,2,3]
c = [1,2,3]

In [67]:
#a = b
print(a == b)
print(a == c)
print(a is c)

True
True
False
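
A minimal sketch of identity (`is`) versus equality (`==`), and why it matters when a list is mutated. The variable names are illustrative only.

```python
# Identity vs. equality: `is` compares object identity, `==` compares values.
a = [1, 2, 3]
b = a          # b is a second name for the same list object
c = list(a)    # c is a new list with equal contents

same_object = a is b             # one object, two names
equal_values = a == c            # same contents
different_objects = a is not c   # separate objects in memory

b.append(4)                      # mutating through b is visible through a
shared_mutation = a == [1, 2, 3, 4]

print(same_object, equal_values, different_objects, shared_mutation)
print(c)                         # the copy did not change
```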


###### <p><a name="lambda"></a></p>
### Lambda Functions and Named Functions

In [1]:
import math

#regular notation
def vector_length(x, y):
    return math.sqrt(x**2 + y**2)

#lambda notation

Vector_length = lambda x, y: math.sqrt(x**2 + y**2)
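
A short sketch of where each form tends to fit: a named function when the logic is reused, a lambda as a throwaway `key`. The `points` list is made up for illustration.

```python
import math

def vector_length(x, y):
    """Named function: reusable, documented, and has a meaningful __name__."""
    return math.sqrt(x**2 + y**2)

points = [(3, 4), (1, 1), (0, 2)]  # illustrative data

# A lambda is convenient as a short, one-off key function.
by_length = sorted(points, key=lambda p: vector_length(p[0], p[1]))

print(by_length)
print(vector_length.__name__)      # name of the def
print((lambda x: x).__name__)      # every lambda is named '<lambda>'
```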

###### <p><a name="modules"></a></p>
### Making and using Modules

**Autoreload**: reloads my modules before each cell executes. I need this while I'm editing a module, so that calling a function always runs the latest saved version of the module.

```python
%load_ext autoreload
%autoreload 2
```

**Wrap my function in Python module**

1. Make a .py file which I'll use as a Python module.
2. Define functions I'll want to use later and call the function by importing the module.

For example, create a file called `my_module.py` in the same folder as my IPython notebook, and define a function in `my_module.py`:

    
    def my_function(x):
        return x*2
    
3. Call it by importing it with this code (the 1st line imports only 1 function, the 2nd line imports every public name):

    ```python
    from my_module import my_function
    from my_module import *
    ```
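
Outside a notebook, the standard library's `importlib.reload()` does a similar job to autoreload. The sketch below is self-contained: it writes a throwaway module named `my_module_demo` (a hypothetical name) into a temporary folder, edits it, and reloads it.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid a stale cached .pyc in this quick demo

# Write a throwaway module (hypothetical name) into a temporary folder.
folder = pathlib.Path(tempfile.mkdtemp())
module_file = folder / "my_module_demo.py"
module_file.write_text("def my_function(x):\n    return x * 2\n")
sys.path.insert(0, str(folder))

import my_module_demo
before = my_module_demo.my_function(3)   # runs the first version

# Edit the file on disk, then reload to pick up the change.
module_file.write_text("def my_function(x):\n    return x * 2 + 100\n")
importlib.reload(my_module_demo)
after = my_module_demo.my_function(3)    # runs the edited version

print(before, after)
```

Note that names imported with `from my_module import my_function` keep pointing at the old function after a reload; calling through the module (`my_module.my_function`) picks up the new one.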
    


###### <p><a name="lists"></a></p>
### List Operations and Functions

###### <p><a name="listsrandom"></a></p>
**Sampling Randomly** from a list: **picking random items (with replacement) using `random.choices()`**

```python
random.choices(sequence, k=number_of_items_to_return)
```

In [105]:
import random
random.choices(['Post','Comment','Photo','Share'],k = 7)

['Photo', 'Comment', 'Comment', 'Photo', 'Post', 'Share', 'Post']
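
A small sketch contrasting `random.choices()` (with replacement) and `random.sample()` (without replacement). The seed value is arbitrary and only there to make the run repeatable.

```python
import random

actions = ['Post', 'Comment', 'Photo', 'Share']
rng = random.Random(0)  # seeded generator; the seed 0 is an arbitrary choice

# With replacement: items may repeat, and k may exceed len(actions).
with_replacement = rng.choices(actions, k=7)

# Without replacement: every pick is distinct, so k must be <= len(actions).
without_replacement = rng.sample(actions, k=3)

print(with_replacement)
print(without_replacement)
```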

###### <p><a name="listszip"></a></p>
**Zipping** lists together: **combining lists with `zip()`**

In [2]:
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
c = list(zip(a, b))
c

[(1, 5), (2, 6), (3, 7), (4, 8)]

In [8]:
#unzipping back into separate lists
d = list(zip(*c)) #unzipping gives a list of tuples, one tuple per original list
e = list(d[0])
f = list(d[1])
print(e,f,d, sep = '\n')

[1, 2, 3, 4]
[5, 6, 7, 8]
[(1, 2, 3, 4), (5, 6, 7, 8)]
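
Two related behaviours worth remembering, sketched with made-up data: `zip()` stops at the shortest input, and `itertools.zip_longest()` pads instead.

```python
from itertools import zip_longest

names = ['Megatron', 'Shockwave', 'Thrust']
ranks = [10, 9]                      # deliberately one item short

pairs = list(zip(names, ranks))      # stops at the shorter list
padded = list(zip_longest(names, ranks, fillvalue=None))  # pads the gap
lookup = dict(zip(names, ranks))     # zip pairs also build a dictionary

print(pairs)
print(padded)
print(lookup)
```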


###### <p><a name="lists+"></a></p>
**Adding** things to a list: **list concatenation with +**

In [17]:
autobots = ['Optimus Prime', 'Omega Supreme','Grimlock','Tracks','Mirage']
decepticons = ['Megatron','Devastator','Shockwave','Thrust','Skywarp']

autobots + decepticons

['Optimus Prime',
 'Omega Supreme',
 'Grimlock',
 'Tracks',
 'Mirage',
 'Megatron',
 'Devastator',
 'Shockwave',
 'Thrust',
 'Skywarp']

**Adding** things to a list: **`.append()` method**, MUTATING

In [11]:
autobots.append('Sky Lynx') #MUTATING
autobots

['Optimus Prime', 'Omega Supreme', 'Grimlock', 'Tracks', 'Mirage', 'Sky Lynx']

**Adding** things to a list: **`.insert()` method**, MUTATING

In [18]:
autobots.insert(autobots.index('Tracks'),'Swoop')
autobots

['Optimus Prime',
 'Omega Supreme',
 'Grimlock',
 'Swoop',
 'Tracks',
 'Mirage',
 'Sky Lynx']

**Adding** things to a list: **`.extend()` method**, MUTATING

In [18]:
decepticons.extend(['Hook','Longhaul','Scrapper','Mixmaster','Bonecrusher','Scavenger'])
decepticons

['Megatron',
 'Devastator',
 'Shockwave',
 'Thrust',
 'Skywarp',
 'Hook',
 'Longhaul',
 'Scrapper',
 'Mixmaster',
 'Bonecrusher',
 'Scavenger']

**Adding** things to a list: **LIST COMPREHENSION**

In [3]:
l = [i for i in range(11)]
l

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
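
A compact sketch of the MUTATING vs. non-mutating distinction used in the notes above: `+` builds a new list, while `.append()` and `.extend()` change the existing one. The list contents are illustrative.

```python
team = ['Optimus Prime', 'Grimlock']
alias = team                       # second name for the same list

combined = team + ['Mirage']       # + builds a NEW list; team is unchanged
length_after_concat = len(team)

team.append('Tracks')              # mutates in place; alias sees the change
team.extend(['Jazz', 'Cosmos'])    # adds each element of the iterable
team.append(['Swoop'])             # append adds its argument as ONE element

print(combined)
print(alias)
```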

==============================================================
###### <p><a name="listsefficient"></a></p>
### Adding to Lists: Efficiency and Optimization

In [6]:
#concatenation
def test1():
    l = []
    for i in range(1000):
        l = l + [i]

#append
def test2():
    l = []
    for i in range(1000):
        l.append(i)

#list comprehension        
def test3():
    l = [i for i in range(1000)]

#building the list directly from a range: this is the most efficient, followed by list comprehension!!
def test4():
    l = list(range(1000))

In [10]:
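from timeit import Timer

#timeit() returns the TOTAL time in seconds for `number` runs; with number=1000,
#that total equals the average time per call in milliseconds.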

t1 = Timer("test1()", "from __main__ import test1")
print("concat ",t1.timeit(number=1000), "milliseconds")
t2 = Timer("test2()", "from __main__ import test2")
print("append ",t2.timeit(number=1000), "milliseconds")
t3 = Timer("test3()", "from __main__ import test3")
print("comprehension ",t3.timeit(number=1000), "milliseconds")
t4 = Timer("test4()", "from __main__ import test4")
print("list range ",t4.timeit(number=1000), "milliseconds")

concat  0.9261318519993438 milliseconds
append  0.06539415499992174 milliseconds
comprehension  0.032425284999590076 milliseconds
list range  0.013522391999686079 milliseconds


An illustration of the time it takes to execute `pop()` from the beginning of the list
vs. from the end of the list: End of the list is more efficient.

In [12]:
popzero = Timer("x.pop(0)",
                       "from __main__ import x")

popend = Timer("x.pop()",
                       "from __main__ import x")

x = list(range(2000000)) #takes much longer
print(popzero.timeit(number = 1000))

x = list(range(2000000))
print(popend.timeit(number = 1000))


0.818389271000342
6.128799941507168e-05
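
`x.pop(0)` is slow because every remaining element has to shift one position to the left. When items must come off the front often, `collections.deque` is a common alternative; a minimal sketch:

```python
from collections import deque

queue = deque(range(5))    # deque([0, 1, 2, 3, 4])

first = queue.popleft()    # removes from the left end without shifting items
last = queue.pop()         # removes from the right end, like list.pop()
queue.appendleft(-1)       # adding on the left is cheap too

print(first, last, list(queue))
```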


In [6]:
a = [3,2,1,1,1]

i = 0

while i < (len(a)):
    print(a[i])
    i += 1



3
2
1
1
1


In [7]:
L = list(range(10))
from itertools import accumulate

test = accumulate(a)

print(list(test))


[3, 5, 6, 7, 8]


In [9]:
import numpy as np
l=list(range(10))
avg = np.mean(l)
sum_ = accumulate(l)
print(list(sum_))
print(l)
print(avg)
diff = [v - avg for v in l]
print(sum(diff))

[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
4.5
0.0


========================================================================

###### <p><a name="lists-"></a></p>
**Removing** things from a list: **emptying the list with `.clear()`**, MUTATING

In [26]:
autobots.clear()
autobots

[]

**Removing** things from a list: **`.pop()` method** MUTATING, and RETURNS the removed item (the last one by default)

In [20]:
decepticons.pop()

'Scavenger'

**Removing** things from a list: **`.remove()` method** MUTATING

In [31]:
decepticons.remove('Devastator')
decepticons

['Megatron',
 'Shockwave',
 'Thrust',
 'Skywarp',
 'Hook',
 'Longhaul',
 'Scrapper',
 'Mixmaster',
 'Bonecrusher']

In [116]:
help(list.remove)

Help on method_descriptor:

remove(self, value, /)
    Remove first occurrence of value.
    
    Raises ValueError if the value is not present.



###### <p><a name="listscontents"></a></p>
**Information about list contents** counting occurrences of a value: **`.count()` method** NON MUTATING

In [34]:
decept_ranks = [10,9,6,6,6,5,7,5,5]
decept_ranks.count(6)

3

**Information about list contents** finding the index location of a value: **`.index()` method** NON MUTATING

In [35]:
decept_ranks.index(10)

0

###### <p><a name="listsarrangement"></a></p>
**Sorting** list contents: **`.sort()` method** MUTATING

In [40]:
decepticons.sort()
decepticons

['Bonecrusher',
 'Hook',
 'Longhaul',
 'Megatron',
 'Mixmaster',
 'Scrapper',
 'Shockwave',
 'Skywarp',
 'Thrust']

**Sorting** list contents: **`sorted()` function** with a `key`, NON MUTATING

In [43]:
sorted(decepticons, key = len)

['Hook',
 'Thrust',
 'Skywarp',
 'Longhaul',
 'Megatron',
 'Scrapper',
 'Mixmaster',
 'Shockwave',
 'Bonecrusher']

**Reversing order** of a list: **`sorted()` with `reverse = True`** NON MUTATING

In [45]:
sorted(decepticons, key = len, reverse = True)

['Bonecrusher',
 'Mixmaster',
 'Shockwave',
 'Longhaul',
 'Megatron',
 'Scrapper',
 'Skywarp',
 'Thrust',
 'Hook']

In [1]:
list_=[4,10,14,19]
print(min(list_))

4


**Reversing order** of a list: **`.reverse()` METHOD** MUTATING

In [47]:
decepticons.reverse()
decepticons

['Thrust',
 'Skywarp',
 'Shockwave',
 'Scrapper',
 'Mixmaster',
 'Megatron',
 'Longhaul',
 'Hook',
 'Bonecrusher']
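
A quick sketch of the return values, since mixing them up is an easy mistake: `.sort()` and `.reverse()` change the list and return `None`, while `sorted()` and `reversed()` leave it alone. The names are illustrative.

```python
names = ['Thrust', 'Hook', 'Megatron']

new_list = sorted(names)            # returns a new list; names is untouched
untouched = list(names)             # snapshot to show names did not change

result = names.sort()               # sorts in place and returns None
backwards = list(reversed(names))   # reversed() returns an iterator, not a list
by_len = sorted(names, key=len, reverse=True)

print(new_list, result, backwards, by_len, sep='\n')
```

Writing `names = names.sort()` therefore replaces the list with `None`.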

###### <p><a name="listsiterations"></a></p>
**Iterating** through list contents: **`iter()`** and **`next()`** FUNCTIONS
```Python
iter(iterable)   # returns an iterator over the iterable
next(iterator)   # returns the next item from the iterator
```
Can be used for ANY iterable (like dictionaries, tuples), not just lists!

In [49]:
autobots = ['Optimus Prime','Grimlock','Omega Supreme','Jazz','Cosmos']
iter_autobot = iter(autobots)
next(iter_autobot)

'Optimus Prime'

In [50]:
next(iter_autobot)

'Grimlock'

In [51]:
next(iter_autobot)

'Omega Supreme'

In [52]:
list(iter_autobot) #returns the remaining items that have NOT been iterated yet as a list!

['Jazz', 'Cosmos']
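
What happens when the iterator runs out is worth a sketch: `next()` raises `StopIteration` unless a default is supplied as its second argument. The list is shortened for illustration.

```python
autobots = ['Optimus Prime', 'Grimlock']
it = iter(autobots)

first = next(it)
second = next(it)
fallback = next(it, 'no more autobots')   # default avoids StopIteration

try:
    next(it)                              # exhausted and no default given
    exhausted = False
except StopIteration:
    exhausted = True

# A for loop calls iter() and next() behind the scenes and stops cleanly.
seen = [name for name in autobots]

print(first, second, fallback, exhausted, seen, sep='\n')
```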

###### <p><a name="listsmap"></a></p>
**Applying a function** to list contents: **`map()`**
```Python
map(function, iterable)
```
Applies the function to every element and returns a lazy iterator, so wrap it in `list()` to see the results.

In [53]:
decept_power = [10,8,8,7,6,9,2,9,5]
list(map(lambda x: x + 1,decept_power))

[11, 9, 9, 8, 7, 10, 3, 10, 6]

In [9]:
#using map with two lists: x comes from the first list, y from the second
list(map(lambda x, y: x[y], [[1, 2], [2, 3]], [0, 1]))

[1, 3]

In [82]:
#using map with zipped lists
f1 = [93,97,101,43,55]
f2 = [10,211,5,88,67]
#here each z is a tuple (f1 element, f2 element) produced by zip, so z[0]*z[1]
#multiplies the paired elements of the two lists
list(map(lambda z: z[0]*z[1],zip(f1,f2)))

[930, 20467, 505, 3784, 3685]

###### <p><a name="listsfilter"></a></p>
**Filtering** a list by some condition: **`filter()`**
```Python
filter(function, iterable)
```
Keeps only the elements for which the function returns a truthy value.

In [68]:
list(filter(lambda x: x > 8, decept_power))

[10, 9, 9]

Combined with **`map()`** to perform an operation on every element of a filtered list:

In [69]:
list(map(lambda x: x+3,filter(lambda x: x > 8, decept_power)))

[13, 12, 12]

Using `filter()` to drop the empty strings from a list, after `map()` strips the surrounding whitespace:

In [70]:
messy_list = ['value1    ','', 'value2      ','','','value3      ','     value4   ']
list(filter(lambda s: s != '',list(map(lambda s: s.strip(),messy_list))))

['value1', 'value2', 'value3', 'value4']
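
The same `map()` and `filter()` results can be written as list comprehensions, which many people find easier to read. A sketch reusing the data from the cells above:

```python
decept_power = [10, 8, 8, 7, 6, 9, 2, 9, 5]
messy_list = ['value1    ', '', 'value2      ', '', '', 'value3      ', '     value4   ']

plus_one = [x + 1 for x in decept_power]                    # like map
strong = [x for x in decept_power if x > 8]                 # like filter
strong_plus_three = [x + 3 for x in decept_power if x > 8]  # filter, then map
clean = [s.strip() for s in messy_list if s.strip()]        # strip, drop empties

print(plus_one, strong, strong_plus_three, clean, sep='\n')
```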

###### <p><a name="strings"></a></p>
### String Operations and Functions

###### <p><a name="stringfind"></a></p>
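**Finding a substring** with the `.find()` method, which returns the index where the first match starts. An optional second argument sets the position to start searching from.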

In [1]:
'Overlord'.find('lord')

4

In [5]:
'Black Zarak'.find('Zarak', 4)

6
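
A brief sketch of the not-found cases: `.find()` returns `-1`, `.index()` raises `ValueError`, and `in` gives a plain boolean.

```python
name = 'Black Zarak'

missing = name.find('Prime')     # -1 when the substring is absent
present = 'Zarak' in name        # membership test, no index needed
last_a = name.rfind('a')         # searches from the right-hand end

try:
    name.index('Prime')          # like find(), but raises when absent
    raised = False
except ValueError:
    raised = True

print(missing, present, last_a, raised)
```

Because `-1` is truthy, test the result with `!= -1` (or use `in`) rather than `if name.find(...)`.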

###### <p><a name="stringescapes"></a></p>
**Creating spaces** when using the `print()` call.

In [7]:
print('Zeta Prime','Sentinel Prime', 'Optimus Prime', 'Rodimus Prime', sep = '\t') #tab space
print('Zeta Prime','Sentinel Prime', 'Optimus Prime', 'Rodimus Prime', sep = '\n') #newline

Zeta Prime	Sentinel Prime	Optimus Prime	Rodimus Prime
Zeta Prime
Sentinel Prime
Optimus Prime
Rodimus Prime


**Declaring a raw string** to avoid confusing \ with escape characters

In [9]:
print(r'C:\Documents\Emanuel')

C:\Documents\Emanuel


###### <p><a name="stringlist"></a></p>
**Convert a string to a list** of single-character strings using the function **`list()`**:

In [25]:
list('Autobots')

['A', 'u', 't', 'o', 'b', 'o', 't', 's']

###### <p><a name="stringlist"></a></p>
**Split a string into a list** of substrings at each separator using the method **`.split()`**:

In [1]:
d = 'Soundwave, Thundercracker, Starscream and Skywarp, attack'
d.split(', ')

['Soundwave', 'Thundercracker', 'Starscream and Skywarp', 'attack']

**Convert a list** to a **string** using **`join(list)`**:
```Python
    "".join(list) #double quotes mean empty string
```

In [26]:
"".join(['A', 'u', 't', 'o', 'b', 'o', 't', 's'])

'Autobots'

###### <p><a name="stringjoin"></a></p>
**Combining strings** with **`join()`**
```Python
separator.join(iterable_of_strings)
```
Recall that `join` takes a single iterable of strings (such as a list); the string it is called on is placed between the items.

In [27]:
print(' '.join(['Orion Pax', 'Optimus Prime'])) #with a space
print(''.join(['Orion Pax', 'Optimus Prime'])) #no space
print('Alpha Trion'.join(['Orion Pax', 'Optimus Prime'])) #'Alpha Trion' is the separator, so it
                                                          #is placed between the list items.

Orion Pax Optimus Prime
Orion PaxOptimus Prime
Orion PaxAlpha TrionOptimus Prime


In [28]:
primes = ['Zeta Prime','Sentinel Prime', 'Optimus Prime', 'Rodimus Prime']
print(' '.join(primes))
slogan = ['Autobots','wage','their','battle','to','destroy','the','Decepticons']
print(' '.join(slogan))
print(', '.join(primes))

Zeta Prime Sentinel Prime Optimus Prime Rodimus Prime
Autobots wage their battle to destroy the Decepticons
Zeta Prime, Sentinel Prime, Optimus Prime, Rodimus Prime


###### <p><a name="stringstrip"></a></p>
**Stripping out leading and trailing spaces** with `.strip()`
```Python
string.strip()
```

In [29]:
s = '    Too     many      spaces   .'
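
A sketch of what `.strip()` does and does not do with the string above: it only trims the ends, so collapsing the inner runs of spaces needs `split()` plus `join()`. The `'--gm--'` value is made up for illustration.

```python
s = '    Too     many      spaces   .'

stripped = s.strip()              # trims leading/trailing whitespace only
collapsed = ' '.join(s.split())   # split() with no argument splits on any run of whitespace
left_only = s.lstrip()            # lstrip()/rstrip() trim one side
trimmed = '--gm--'.strip('-')     # an argument lists the characters to trim

print(repr(stripped))
print(repr(collapsed))
print(repr(trimmed))
```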

###### <p><a name="stringreplace"></a></p>
**Replacing substrings** with `.replace()`, which returns a new string
```Python
    string.replace(old, new)
```
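
A minimal usage sketch for `.replace()`. The replacement words are arbitrary examples, and the optional third argument caps the number of replacements.

```python
slogan = 'Autobots wage their battle to destroy the Decepticons'

swapped = slogan.replace('Autobots', 'Maximals')   # returns a new string
first_only = 'a-b-c-d'.replace('-', '+', 1)        # count=1: only the first match
unchanged = slogan.replace('Predacons', 'X')       # no match: same text back

print(swapped)
print(first_only)
print(unchanged == slogan)
```

Strings are immutable, so `slogan` itself is never modified; assign the result to keep it.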

###### <p><a name="prefix"></a></p>
**Adding a prefix or suffix** with `%` string formatting in a list comprehension
```Python
['prefix_%s' % column for column in columnlist]
```

In [1]:
gm_columns = ['ds',
 'trend',
 'yhat_lower',
 'yhat_upper',
 'trend_lower',
 'trend_upper',
 'additive_terms',
 'additive_terms_lower',
 'additive_terms_upper',
 'weekly',
 'weekly_lower',
 'weekly_upper',
 'yearly',
 'yearly_lower',
 'yearly_upper',
 'multiplicative_terms',
 'multiplicative_terms_lower',
 'multiplicative_terms_upper',
 'yhat']

In [5]:
gm_prefix = ['gm_%s' % column for column in gm_columns]
gm_prefix[:5]

['gm_ds', 'gm_trend', 'gm_yhat_lower', 'gm_yhat_upper', 'gm_trend_lower']

In [6]:
gm_suffix = ['%s_gm' % column for column in gm_columns]
gm_suffix[:5]

['ds_gm', 'trend_gm', 'yhat_lower_gm', 'yhat_upper_gm', 'trend_lower_gm']
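
The same prefix and suffix lists can be built with f-strings (Python 3.6+), shown here as an alternative sketch on a shortened column list:

```python
gm_columns = ['ds', 'trend', 'yhat_lower']   # shortened for illustration

prefixed = [f'gm_{column}' for column in gm_columns]
suffixed = [f'{column}_gm' for column in gm_columns]

print(prefixed)
print(suffixed)
```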

###### <p><a name="loops"></a></p>
### FOR and WHILE Loops

###### <p><a name="for"></a></p>
Functions to use in tandem with a for loop.

###### <p><a name="enumerate"></a></p>
Return a tuple of **index and value** with `enumerate()`
```Python
    enumerate(list_or_dataframe_series)
```

In [1]:
titans = ['Metroplex', 'Fortress Maximus', 'Trypticon', 'Omega Supreme', 'Scorponok']
for i, e in enumerate(titans): #gives position, and then element at that position.
    print(i, e)

0 Metroplex
1 Fortress Maximus
2 Trypticon
3 Omega Supreme
4 Scorponok


In [2]:
for x in enumerate(titans): #gives position, and then element at that position.
    print(x[0], x[1]) #name of x isn't as meaningful, but this has the same result

0 Metroplex
1 Fortress Maximus
2 Trypticon
3 Omega Supreme
4 Scorponok


In [3]:
#return a list of the tuples.
list(enumerate(titans))

[(0, 'Metroplex'),
 (1, 'Fortress Maximus'),
 (2, 'Trypticon'),
 (3, 'Omega Supreme'),
 (4, 'Scorponok')]
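
Two small `enumerate()` variations, sketched on a shortened list: the `start` argument changes the first index, and the pairs can build a name-to-position lookup.

```python
titans = ['Metroplex', 'Fortress Maximus', 'Trypticon']

numbered = list(enumerate(titans, start=1))             # count from 1 instead of 0
positions = {name: i for i, name in enumerate(titans)}  # name -> position

print(numbered)
print(positions)
```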

Used in conjunction with a DataFrame series to iterate through the series, using the index
provided by `enumerate` as the positional row index. Because it is a position, not a label, use `iloc`!

In [None]:
columns = list(tesla.columns)
cap_pos = columns.index('mkt_cap')
close_pos = columns.index('Adj. Close')
for i, year in enumerate(tesla['Year']): #i is the row position, year is the value in that row
    #look up the share count for that year in the dictionary, then fill in the market cap
    shares = TSLA_shares.get(year)
    tesla.iloc[i,cap_pos] = shares * tesla.iloc[i,close_pos]

###### <p><a name="numpy"></a></p>
## Numpy

###### <p><a name="Random"></a></p>
### Random Generators

**Random Number Generators** random sampling using **`random.choice()`**
```Python
np.random.choice(a, size=None, replace=True)
```
Selects values from the array-like `a` (if an int is passed, it samples from `np.arange(a)`)
and returns them in the shape given by `size`. Pass `replace = False` if I do NOT want replacement.

In [1]:
import numpy as np

In [7]:
np.random.choice(3,size = None, replace = False)

0

In [106]:
np.random.choice(range(10000,99999), size = 7, replace = False)

array([25745, 25473, 73924, 15800, 15928, 37378, 17360])

**Standard Normal Distribution** random sampling using **`random.randn()`**
```Python
np.random.randn(d0, d1, ..., dn)
```
Gives floats from a **standard normal/Gaussian** distribution (mean 0, variance 1) in an array of the provided dimensions.

In [4]:
a = np.random.randn(10)
print(a)
a.shape

[ 0.26508491  0.5691604  -2.08971275 -0.30662213  2.6151798  -0.57662711
  1.03056508 -1.60620426  0.0376363   0.30862052]


(10,)

In [108]:
np.random.randn(3,2)

array([[-0.55785498,  0.77138124],
       [ 0.09456126, -1.7406014 ],
       [-0.37980837,  1.83938546]])

In [5]:
np.random.randn(4,2)

array([[-0.37333746, -1.89273096],
       [ 0.0675416 ,  0.28361577],
       [ 0.24800602,  1.32341884],
       [-1.54685508, -2.93272528]])

**Uniform [0, 1) Distribution** random sampling from **`random.rand()`**
```Python
np.random.rand(d0, d1, ..., dn)
```
Gives floats from a **uniform** distribution over [0, 1) (1 itself is excluded) in an array of the provided dimensions.

In [110]:
np.random.rand(3,2)

array([[0.10179232, 0.30732347],
       [0.5074515 , 0.84870062],
       [0.80980983, 0.26996448]])

In [15]:
np.random.rand(10) #flat array

array([0.03529908, 0.65505208, 0.99393536, 0.52434808, 0.81324903,
       0.5750699 , 0.34892303, 0.43407095, 0.00500782, 0.79429364])

In [23]:
X = np.random.rand(10)
print(type(X))
sorted(X) #returns a new Python list; X itself is unchanged
type(X)   #still an ndarray

<class 'numpy.ndarray'>


numpy.ndarray

**Specified Range** random sampling using **`random.randrange()`** (from Python's built-in `random` module, not numpy)
```Python
random.randrange(start, stop, step)
```
Returns one randomly chosen integer from `range(start, stop, step)`; `stop` is excluded.

In [24]:
from random import randrange
for listSize in range(5,26,5):
    mylist = [randrange(10) for x in range(listSize)]
    print(mylist, len(mylist))

[0, 6, 1, 2, 2] 5
[1, 6, 2, 4, 4, 1, 3, 2, 1, 1] 10
[4, 6, 6, 9, 2, 4, 4, 9, 6, 0, 7, 0, 8, 6, 5] 15
[5, 1, 8, 9, 6, 1, 1, 0, 9, 6, 3, 3, 2, 3, 4, 3, 3, 2, 1, 5] 20
[3, 7, 4, 9, 4, 8, 4, 7, 3, 1, 3, 3, 8, 1, 7, 5, 4, 0, 4, 2, 4, 6, 4, 5, 3] 25


In [9]:
[randrange(10) for x in range(5)]

[2, 2, 8, 3, 2]

In [20]:
randrange(5,26,5)

15
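
The `np.random.*` functions above are NumPy's legacy interface. Newer NumPy versions (1.17+) also offer a `Generator` object via `np.random.default_rng()`; the sketch below shows rough equivalents. The seed `917` is arbitrary and only makes the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(917)   # seeded Generator; the seed is arbitrary

picks = rng.choice(10, size=5, replace=False)   # like np.random.choice
normal = rng.standard_normal((3, 2))            # like np.random.randn(3, 2)
uniform = rng.random((3, 2))                    # like np.random.rand(3, 2), values in [0, 1)
ints = rng.integers(0, 10, size=5)              # integers from 0 up to, not including, 10

print(picks)
print(normal.shape, uniform.shape)
print(ints)
```

Note that the Generator methods take the shape as a single tuple rather than as separate arguments.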

### Other Functions

Create a new array **filled with ones (1s)** with `np.ones()`
```Python
np.ones(shape, dtype=None, order='C')   # shape: an int or a tuple like (2, 3); order: 'C' row-major or 'F' column-major
```

In [5]:
np.ones((5,4),float)

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [17]:
ones = np.ones((6,2))
print(ones)

[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]


Create a new array **filled with zeros (0s)**, with the same shape and dtype as an existing array, using `np.zeros_like()`
```Python
np.zeros_like(existing_array)
```

In [16]:
np.zeros_like(ones)

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

**Append to an array** with `np.append()`, NON MUTATING (returns a new array)
```Python
np.append(array, values)
```

In [8]:
testarray = np.random.randn(10)

In [9]:
np.append(testarray,100)

array([ -1.026232  ,   0.63232232,   0.47562479,  -0.65869988,
        -0.43056979,   0.74013052,  -0.63687727,  -0.43816709,
         0.84109724,   0.49085495, 100.        ])
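
Unlike `list.append()`, `np.append()` never changes the original array, and without an `axis` it flattens its inputs. A small sketch with made-up values:

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
longer = np.append(arr, 100)     # returns a NEW array; arr is unchanged

grid = [[1, 2], [3, 4]]
flat = np.append(grid, [[5, 6]])             # axis=None flattens everything
stacked = np.append(grid, [[5, 6]], axis=0)  # axis=0 adds a row instead

print(arr, longer)
print(flat)
print(stacked.shape)
```

Because each call copies the data, growing an array in a loop with `np.append()` is slow; collecting values in a list and converting once is usually cheaper.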

In [10]:
np.dot?

In [21]:
a = [[1, 0], [0, 1]]
a = np.array(a)
a.shape

(2, 2)

###### <p><a name="numspace"></a></p>
### Arrays with numbers across a specified interval or distance

**Evenly spaced numbers**: an array of evenly spaced numbers generated by `np.linspace()` (the stop value is included by default)
```Python
np.linspace(start, stop, num_samples)
```

In [11]:
np.linspace(1,10,4, axis = 0)

array([ 1.,  4.,  7., 10.])

In [12]:
np.linspace(-5,5,20)

array([-5.        , -4.47368421, -3.94736842, -3.42105263, -2.89473684,
       -2.36842105, -1.84210526, -1.31578947, -0.78947368, -0.26315789,
        0.26315789,  0.78947368,  1.31578947,  1.84210526,  2.36842105,
        2.89473684,  3.42105263,  3.94736842,  4.47368421,  5.        ])
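
`np.linspace()` fixes the number of points, while `np.arange()` fixes the step size. A short side-by-side sketch:

```python
import numpy as np

closed = np.linspace(1, 10, 4)                     # 4 points, stop included
half_open = np.linspace(1, 10, 3, endpoint=False)  # 3 points, stop excluded
stepped = np.arange(1, 10, 3)                      # step of 3, stop excluded
points, step = np.linspace(1, 10, 4, retstep=True) # also return the spacing

print(closed)
print(half_open)
print(stepped)
print(step)
```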

**Creating a DataFrame with Random Values** using random sampling to make a DataFrame

In [99]:
import pandas as pd

In [111]:
rand_generator = lambda min_,max_,size_: np.random.choice(range(min_,max_),size = size_, replace = False)

In [113]:
data = pd.DataFrame({'Facebook_ID':list(rand_generator(10000,99999,7)),
                     'action_ID':list(rand_generator(100,9999,7)),
                     'type': random.choices(['Post','Comment','Photo','Share'], k = 7)})
data

| | Facebook_ID | action_ID | type |
|---|---|---|---|
| 0 | 15760 | 682 | Photo |
| 1 | 51971 | 323 | Photo |
| 2 | 44053 | 5501 | Post |
| 3 | 64467 | 6472 | Post |
| 4 | 45663 | 936 | Comment |
| 5 | 30852 | 1872 | Share |
| 6 | 68974 | 1382 | Share |


In [115]:
data1 = pd.DataFrame({'referred_ID': [555,898,3453,2215,5501,234,5501]})
data = pd.concat([data,data1], axis = 1, sort = False)
data

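| | Facebook_ID | action_ID | type | referred_ID |
|---|---|---|---|---|
| 0 | 15760 | 682 | Photo | 555 |
| 1 | 51971 | 323 | Photo | 898 |
| 2 | 44053 | 5501 | Post | 3453 |
| 3 | 64467 | 6472 | Post | 2215 |
| 4 | 45663 | 936 | Comment | 5501 |
| 5 | 30852 | 1872 | Share | 234 |
| 6 | 68974 | 1382 | Share | 5501 |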


In [100]:
help(sample)

NameError: name 'sample' is not defined

In [85]:
dir(np.random)

['Lock',
 'RandomState',
 '__RandomState_ctor',
 '__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'absolute_import',
 'beta',
 'binomial',
 'bytes',
 'chisquare',
 'choice',
 'dirichlet',
 'division',
 'exponential',
 'f',
 'gamma',
 'geometric',
 'get_state',
 'gumbel',
 'hypergeometric',
 'laplace',
 'logistic',
 'lognormal',
 'logseries',
 'mtrand',
 'multinomial',
 'multivariate_normal',
 'negative_binomial',
 'noncentral_chisquare',
 'noncentral_f',
 'normal',
 'np',
 'operator',
 'pareto',
 'permutation',
 'poisson',
 'power',
 'print_function',
 'rand',
 'randint',
 'randn',
 'random',
 'random_integers',
 'random_sample',
 'ranf',
 'rayleigh',
 'sample',
 'seed',
 'set_state',
 'shuffle',
 'standard_cauchy',
 'standard_exponential',
 'standard_gamma',
 'standard_normal',
 'standard_t',
 'test',
 'triangular',
 'uniform',
 'vonmises',
 'wald',
 'weibull',
 'zipf']

In [89]:
np.arange(3)

array([0, 1, 2])

In [109]:
help(np.random.rand)

Help on built-in function rand:

rand(...) method of mtrand.RandomState instance
    rand(d0, d1, ..., dn)
    
    Random values in a given shape.
    
    Create an array of the given shape and populate it with
    random samples from a uniform distribution
    over ``[0, 1)``.
    
    Parameters
    ----------
    d0, d1, ..., dn : int, optional
        The dimensions of the returned array, should all be positive.
        If no argument is given a single Python float is returned.
    
    Returns
    -------
    out : ndarray, shape ``(d0, d1, ..., dn)``
        Random values.
    
    See Also
    --------
    random
    
    Notes
    -----
    This is a convenience function. If you want an interface that
    takes a shape-tuple as the first argument, refer to
    np.random.random_sample .
    
    Examples
    --------
    >>> np.random.rand(3,2)
    array([[ 0.14022471,  0.96360618],  #random
           [ 0.37601032,  0.25528411],  #random
           [ 0.49313049,  0.94909878]]) #random

In [44]:
help(sorted)

Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.



### <p><a name="machine learning"></a></p>
# II. Machine Learning

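In [None]:
#assumed imports (not shown in my notes); train (features) and target (labels) are prepared elsewhere
import numpy as np
from sklearn import linear_model, model_selection as ms
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

#Train test split
X_train, X_test, y_train, y_test = ms.train_test_split(train, target, test_size=0.25, random_state = 917)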

### Vanilla Ridge

In [None]:
ridge = linear_model.Ridge() #normalize = False was the default; the parameter was removed in scikit-learn 1.2
ridge = ridge.set_params(random_state = 90)

# Train vanilla model
ridge.fit(X_train, y_train)

In [None]:
#get predictions from ridge, then calculate MSE and RMSE by comparing test vs. prediction
ridge_result = ridge.predict(X_test)

print('Mean Squared Error (MSE):', mean_squared_error(y_test, ridge_result))
print('Root Mean Square Error (RMSE):', np.sqrt(mean_squared_error(y_test, ridge_result)))
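
For reference, a sketch of what those two metrics compute, done by hand with numpy on toy values (the numbers are made up and unrelated to the model above):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])   # toy targets
y_pred = np.array([2.0, 5.0, 4.0])   # toy predictions

errors = y_true - y_pred             # [1, 0, -2]
mse = np.mean(errors ** 2)           # mean of the squared errors: (1 + 0 + 4) / 3
rmse = np.sqrt(mse)                  # back in the same units as the target

print(mse, rmse)
```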

### Grid Search Ridge

In [None]:
parameters = {'alpha': np.linspace(0.1,100,200)}

ridge = Ridge() #normalize = False was the default; the parameter was removed in scikit-learn 1.2
ridge_model = ms.GridSearchCV(estimator = ridge, 
                              param_grid = parameters, 
                              scoring = 'neg_mean_squared_error', 
                              cv = 5, verbose = True, n_jobs = -1,
                              return_train_score = True)
#time the fit itself (fitting on all the data here; alternative: ridge_model.fit(X_train,y_train))
%time ridge_model = ridge_model.fit(train,target)

In [None]:
print(ridge_model.best_params_)

In [None]:
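#save the estimator that was refit with the best hyperparameters
ridge_best = ridge_model.best_estimator_

#fit using the best hyperparameters (GridSearchCV already does this when refit = True, the default)
ridge_best.fit(train,target)

#get the R^2 score on the training data using the best hyperparameters
ridge_best.score(train,target)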