# Libraries in Python

- A Python library is a collection of pre-written code (functions, classes, and modules) that lets you perform many tasks without writing the code yourself.

# NumPy

- NumPy stands for **Numerical Python** and is the core library for numeric and scientific computing.

- Python lists can serve as arrays, but they are slow for numeric work. *NumPy arrays are stored in one contiguous block of memory, unlike lists, so they can be accessed and manipulated very efficiently.*

- NumPy aims to provide an array object that is up to 50x faster than traditional Python lists; the actual speed-up depends on the operation and the array size.

- The array object in NumPy is called **ndarray**.
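
As a small illustration of the points above, the sketch below builds an ndarray and inspects a few of its attributes. The variable names are only illustrative, and the integer dtype printed depends on the platform.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
print(type(arr))      # <class 'numpy.ndarray'>
print(arr.dtype)      # an integer dtype; the exact width depends on the platform
print(arr.itemsize)   # bytes used by one element
print(arr.nbytes)     # total bytes of element data (itemsize * number of elements)

# Mixed input is converted to one common dtype for the whole array.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64
```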

## Why are NumPy arrays (ndarray) faster than Python lists?

NumPy arrays are faster than Python lists for several reasons.

1. **Memory Management**: 
   - NumPy arrays are stored in a contiguous block of memory, which allows efficient caching and vectorized operations, whereas a Python list stores references to objects scattered across memory.
   - This layout improves cache utilization and reduces memory-management overhead.

   ```python
   # Inspect the memory layout of a NumPy array
   import numpy as np
   arr = np.array([1, 2, 3])
   print(arr.flags['C_CONTIGUOUS'])  # True: the elements sit in one contiguous block
   print(arr.strides)                # bytes to step from one element to the next
   ```

2. **Vectorized Operations**:
   - NumPy operations are implemented in C and are vectorized: a single call applies the operation to every element of the array.
   - Python lists require an explicit loop in the interpreter, which is usually slower.

   ```python
   # Vectorized addition with NumPy
   arr1 = np.array([1, 2, 3])
   arr2 = np.array([4, 5, 6])
   result = arr1 + arr2
   ```

3. **Data Types and Homogeneity**:
   - NumPy arrays are homogeneous: all elements share one data type, which allows better optimization.
   - Python lists can hold elements of different types, so each element needs its own type check, which adds overhead.

   ```python
   # Homogeneous data type in NumPy
   arr = np.array([1, 2, 3], dtype=np.int32)
   ```

4. **Underlying Implementation**:
   - NumPy's core routines are written in C and compiled, so the per-element work runs as machine code. Looping over a Python list runs in the interpreter, which adds overhead for every element.

   ```python
   # Module that defines the array type (prints 'numpy')
   print(type(arr).__module__)
   ```

5. **Multi-dimensional Data**:
   - NumPy arrays handle multi-dimensional data and operations efficiently, which is awkward to do with nested Python lists.

   ```python
   # Multi-dimensional array in NumPy
   arr = np.array([[1, 2, 3], [4, 5, 6]])
   ```

In short, NumPy arrays are faster than Python lists because of efficient memory management, vectorized operations, homogeneous data types, a compiled C implementation, and built-in support for multi-dimensional data.
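
The speed difference can be checked with a rough timing sketch like the one below. It assumes the standard-library `timeit` module; the array size and the helper names `double_list` and `double_array` are arbitrary choices for illustration, and the measured factor will vary by machine and operation.

```python
import timeit
import numpy as np

n = 200_000
py_list = list(range(n))
np_arr = np.arange(n)

def double_list():
    # Explicit Python-level loop over list elements
    return [x * 2 for x in py_list]

def double_array():
    # One vectorized call; the loop runs inside NumPy's compiled code
    return np_arr * 2

list_time = timeit.timeit(double_list, number=5)
array_time = timeit.timeit(double_array, number=5)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy expression:   {array_time:.4f} s")
print(f"speed-up factor:    {list_time / array_time:.1f}x")
```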

In [1]:
# Step 1: import libraries
import numpy as np

In [2]:
# 0D - array or scalar values

arr_0d = np.array(42)

print(arr_0d)

42


In [3]:
print(arr_0d.ndim) # Print the dimension of array

0


In [4]:
# 1D - array or vector

arr_1d = np.array([1, 2, 3, 4, 5])

print(arr_1d)

[1 2 3 4 5]


In [5]:
print(arr_1d.ndim)

1


In [6]:
# 2D - array 

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(arr_2d)

[[1 2 3]
 [4 5 6]]


In [7]:
arr_2d[0][0]

1

In [8]:
arr_2d[1][0]

4

In [9]:
print(arr_2d.ndim)

2


In [10]:
# 3D - array

arr_3d = np.array([[[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]]])

print(arr_3d)

[[[1 2 3]
  [4 5 6]
  [7 8 9]]]


In [11]:
print(arr_3d.ndim)

3


In [12]:
arr_3d[0][0][0]

1

In [13]:
arr_3d[0][0][1]

2

In [14]:
arr_3d[0][1][1]

5

In [15]:
arr_3d[0][2][2]

9
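
The chained form `arr_3d[0][1][1]` used above can also be written with a single tuple index, which is the more common NumPy idiom. A minimal sketch, reusing the same array:

```python
import numpy as np

arr_3d = np.array([[[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]]])

chained = arr_3d[0][1][1]   # three separate indexing steps
direct = arr_3d[0, 1, 1]    # one step with a tuple index
print(chained, direct)      # 5 5

print(arr_3d.shape)         # (1, 3, 3)
print(arr_3d[0, :, 0])      # first column of the only block: [1 4 7]
print(arr_3d[0, -1, -1])    # negative indices count from the end: 9
```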

In [16]:
n1 = np.zeros((5, 5))

In [17]:
n1

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [18]:
n1 = np.full((5, 5), 'a') # create a 5 x 5 matrix filled with one given element, here 'a'

In [19]:
n1

array([['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a']], dtype='<U1')

In [20]:
print(n1.dtype)

<U1


In [21]:
n1 = np.full((5, 5), 10) # create a matrix of 5 x 5 with element 10

In [22]:
n1

array([[10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10]])

In [23]:
np.arange(50, 100)

array([50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
       67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
       84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [24]:
np.arange(10, 20)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
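
`np.arange` also accepts a step, and `np.linspace` is a related function for when you know how many values you want rather than the step size. A short sketch:

```python
import numpy as np

evens = np.arange(10, 20, 2)        # start, stop (excluded), step
print(evens)                        # [10 12 14 16 18]

countdown = np.arange(5, 0, -1)     # a negative step counts down
print(countdown)                    # [5 4 3 2 1]

points = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values, stop included
print(points)                       # [0.   0.25 0.5  0.75 1.  ]
```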

In [25]:
n1 = np.array([1, 2, 3, 4, 5])

In [26]:
print(n1.dtype)

int32


In [27]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr)
print(arr.shape)

[[1 2 3 4]
 [5 6 7 8]]
(2, 4)


In [28]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

In [29]:
print(arr[1:5])

[2 3 4 5]


In [30]:
print(arr[-3:-1])

[5 6]


In [31]:
print(arr[1:5:2])

[2 4]
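
One detail worth knowing about the slices above: a basic slice is a view onto the original array, not a copy, so writing to it changes the original. The sketch below shows this, plus slicing along both axes of a 2D array (the name `grid` is illustrative).

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

view = arr[1:5]          # a view: shares memory with arr
view[0] = 99             # writes through to arr
print(arr)               # [ 1 99  3  4  5  6  7]

independent = arr[1:5].copy()   # an explicit copy is independent
independent[0] = -1
print(arr[1])            # 99 (unchanged by the write to the copy)

grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(grid[:, 1:3])      # columns 1 and 2 of every row
```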


In [32]:
nm = np.array([[[1, 2, 3], [4, 5, 6]]])
print(nm)
print(nm.ndim)

[[[1 2 3]
  [4 5 6]]]
3


In [33]:
nm2 = np.array([[[1, 2, 3]]])
print(nm2)
print(nm2.ndim)

[[[1 2 3]]]
3


In [34]:
print(np.shape(nm))
# nm is 3-D, so its shape has three entries; unpacking it into two names would raise ValueError:
# rows, columns = np.shape(nm)
# print(f"rows: {rows}, columns: {columns}")

(1, 2, 3)


In [35]:
arr1r3c = np.array([[1, 2, 3]])
rows, columns = np.shape(arr1r3c)
print(f"rows: {rows}, columns: {columns}")

rows: 1, columns: 3
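
Unpacking a shape only works when the number of names matches the number of dimensions. A minimal sketch for the 3D array `nm` from above (the name `blocks` for the first axis is just a label chosen here):

```python
import numpy as np

nm = np.array([[[1, 2, 3], [4, 5, 6]]])

# Three dimensions need three names.
blocks, rows, columns = nm.shape
print(f"blocks: {blocks}, rows: {rows}, columns: {columns}")

# Slicing the shape tuple gives the last two axes for any array with ndim >= 2.
last_rows, last_columns = nm.shape[-2:]
print(last_rows, last_columns)   # 2 3
```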


In [36]:
arr = np.array([1, 2, 3])

for x in arr:
    print(x)

1
2
3


In [37]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

In [38]:
arr

array([[1, 2, 3],
       [4, 5, 6]])

In [39]:
for x in arr:
    print(x)

[1 2 3]
[4 5 6]
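
As the output shows, iterating over a 2D array yields whole rows. To reach the individual values, one option is a nested loop; `arr.flat` and `np.ndenumerate` are alternatives, sketched below.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Nested loop: outer loop gives rows, inner loop gives scalars.
for row in arr:
    for value in row:
        print(value)

# arr.flat iterates over every element regardless of the number of dimensions.
flat_values = [int(v) for v in arr.flat]
print(flat_values)   # [1, 2, 3, 4, 5, 6]

# np.ndenumerate yields (index tuple, value) pairs.
for (i, j), value in np.ndenumerate(arr):
    print(i, j, value)
```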


In [40]:
n1 = np.array([10, 20, 30])
n2 = np.array([40, 50, 60])

In [41]:
np.vstack((n1, n2))

array([[10, 20, 30],
       [40, 50, 60]])

In [42]:
np.hstack((n1, n2))

array([10, 20, 30, 40, 50, 60])

In [43]:
np.column_stack((n1, n2))

array([[10, 40],
       [20, 50],
       [30, 60]])

In [44]:
np.concatenate((n1, n2))

array([10, 20, 30, 40, 50, 60])
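
For 2D inputs, the `axis` argument of `np.concatenate` decides the joining direction. The sketch below (with illustrative arrays `a` and `b`) shows how it relates to `vstack` and `hstack`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows_joined = np.concatenate((a, b), axis=0)   # stack rows, like np.vstack
cols_joined = np.concatenate((a, b), axis=1)   # stack columns, like np.hstack

print(rows_joined.shape)   # (4, 2)
print(cols_joined.shape)   # (2, 4)
print(np.array_equal(rows_joined, np.vstack((a, b))))   # True
print(np.array_equal(cols_joined, np.hstack((a, b))))   # True
```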

In [45]:
arr1 = np.array([10, 20, 30, 40, 50, 60])

arr2 = np.array([40, 50, 60, 70, 80, 90])

In [46]:
np.intersect1d(arr1, arr2)

array([40, 50, 60])

In [47]:
np.setdiff1d(arr1, arr2)

array([10, 20, 30])

In [48]:
np.setdiff1d(arr2, arr1)

array([70, 80, 90])
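
Alongside `intersect1d` and `setdiff1d`, NumPy has a few related set routines. A short sketch using the same two arrays:

```python
import numpy as np

arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([40, 50, 60, 70, 80, 90])

union = np.union1d(arr1, arr2)        # sorted unique values from both arrays
only_one = np.setxor1d(arr1, arr2)    # values that appear in exactly one array
membership = np.isin(arr1, arr2)      # per-element test: is arr1[i] in arr2?

print(union)        # [10 20 30 40 50 60 70 80 90]
print(only_one)     # [10 20 30 70 80 90]
print(membership)   # [False False False  True  True  True]
```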

In [49]:
n1 = np.array([10, 20, 30])

In [50]:
n1+5

array([15, 25, 35])

In [51]:
n1*10

array([100, 200, 300])

In [52]:
n1/10

array([1., 2., 3.])

In [53]:
n1-6

array([ 4, 14, 24])
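
The scalar operations above are a special case of broadcasting. The same operators also work element by element between two arrays, and between a 2D array and a 1D array of matching width. A small sketch (`n2` and `grid` are illustrative):

```python
import numpy as np

n1 = np.array([10, 20, 30])
n2 = np.array([1, 2, 3])

print(n1 + n2)   # [11 22 33]
print(n1 * n2)   # [10 40 90]

# n1 is added to every row of grid.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid + n1)
# [[11 22 33]
#  [14 25 36]]
```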

In [54]:
n1 = np.array([10, 20, 30, 40, 50, 60])

In [55]:
np.mean(n1)

35.0

In [56]:
np.median(n1)

35.0

In [57]:
np.std(n1)

17.07825127659933
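
Note that `np.std` computes the population standard deviation by default (`ddof=0`), which is the value shown above. For the sample standard deviation, pass `ddof=1`. A minimal sketch, including the manual formula for comparison:

```python
import numpy as np

n1 = np.array([10, 20, 30, 40, 50, 60])

population_std = np.std(n1)          # divides by N (ddof=0, the default)
sample_std = np.std(n1, ddof=1)      # divides by N - 1

# The population value written out by hand.
manual = np.sqrt(np.mean((n1 - n1.mean()) ** 2))

print(population_std)   # 17.07825127659933
print(sample_std)       # about 18.708
print(np.isclose(manual, population_std))   # True
```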

In [58]:
arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

In [59]:
newarr

[array([1, 2]), array([3, 4]), array([5, 6])]

In [60]:
i = 1
for x in newarr:
    print(f'Array {i} : {x}')
    i += 1

Array 1 : [1 2]
Array 2 : [3 4]
Array 3 : [5 6]
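
`np.array_split` also copes with lengths that do not divide evenly, whereas `np.split` raises an error in that case. A short sketch with a 7-element array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

# 7 elements into 3 parts: the first part gets the extra element.
parts = np.array_split(arr, 3)
for part in parts:
    print(part)

# np.split insists on equal-sized parts.
try:
    np.split(arr, 3)
except ValueError as exc:
    print("np.split failed:", exc)
```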


In [61]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])

x = np.where(arr == 4)

print(x)

(array([3, 5, 6], dtype=int64),)
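
`np.where` returns a tuple with one index array per dimension, which is why the output above is wrapped in parentheses. It also has a three-argument form that picks values instead of returning indices. A minimal sketch:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 4])

# Take the first (and only) entry of the tuple for a 1D array.
indices = np.where(arr == 4)[0]
print(indices)    # [3 5 6]

# np.where(condition, value_if_true, value_if_false)
replaced = np.where(arr == 4, 0, arr)
print(replaced)   # [1 2 3 0 5 0 0]
```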


In [62]:
arr = np.array([3, 2, 0, 1])

print(np.sort(arr))

[0 1 2 3]
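
`np.sort` returns a sorted copy and leaves the original untouched. The sketch below contrasts it with the in-place `arr.sort()` method and shows `np.argsort`, which returns the ordering rather than the values.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)
print(arr)              # [3 2 0 1] (original unchanged)

order = np.argsort(arr) # indices that would sort arr
print(order)            # [2 3 1 0]

descending = np.sort(arr)[::-1]
print(descending)       # [3 2 1 0]

arr.sort()              # sorts arr itself, returns None
print(arr)              # [0 1 2 3]
```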


In [63]:
arr = np.array([41, 42, 43, 44])

In [64]:
newarr = arr[arr > 42]

In [65]:
newarr

array([43, 44])
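
The filter `arr[arr > 42]` works because `arr > 42` is itself a boolean array (a mask). Masks can be combined with `&` (and), `|` (or), and `~` (not); each condition needs its own parentheses. A short sketch:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

mask = arr > 42
print(mask)          # [False False  True  True]

in_range = arr[(arr > 41) & (arr < 44)]
print(in_range)      # [42 43]

even_or_big = arr[(arr % 2 == 0) | (arr > 43)]
print(even_or_big)   # [42 44]

print(arr[~mask])    # [41 42]
```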

# Pandas



In [66]:
import pandas as pd

In [67]:
s = pd.Series([1, 2, 3])

In [68]:
s

0    1
1    2
2    3
dtype: int64

In [69]:
s[0] # pandas assigns an index label to each element of a Series (a single column); the default labels are 0, 1, 2, ...

1

In [70]:
s[1]

2

In [71]:
s[2]

3

In [72]:
a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])

In [73]:
a # Series with a user-defined index

x    1
y    2
z    3
dtype: int64
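
With a user-defined index, elements can be reached by label or by position. A minimal sketch using the same Series:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['x', 'y', 'z'])

print(a['y'])          # by label: 2
print(a.loc['y'])      # by label, explicit form: 2
print(a.iloc[1])       # by position: 2

print(a[['x', 'z']])   # a list of labels returns a smaller Series
print(a.index.tolist())   # ['x', 'y', 'z']
```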

In [74]:
calories = {"day1": 420, "day2": 380, "day3": 390}

In [75]:
dic = pd.Series(calories)  

In [76]:
dic

day1    420
day2    380
day3    390
dtype: int64

In [77]:
dic2 = pd.Series({"day1": 420, "day2": 380, "day3": 390}, index=["day1", "day2", "day3", "day4"])
dic2

day1    420.0
day2    380.0
day3    390.0
day4      NaN
dtype: float64
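
In the output above, `day4` has no entry in the dictionary, so pandas fills it with `NaN` and the Series becomes `float64`. The sketch below shows the opposite case (an index that selects only some keys) and two common ways to handle the missing value; the names `subset`, `with_missing`, and `filled` are illustrative.

```python
import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

# An index with fewer labels keeps only those keys.
subset = pd.Series(calories, index=["day1", "day2"])
print(subset)

# An extra label with no matching key becomes NaN.
with_missing = pd.Series(calories, index=["day1", "day2", "day3", "day4"])
print(with_missing.isna())

# Replace NaN with 0 and go back to integers.
filled = with_missing.fillna(0).astype(int)
print(filled)
```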

In [78]:
s1 = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

In [79]:
s1 + 5

0     6
1     7
2     8
3     9
4    10
5    11
6    12
7    13
8    14
9    15
dtype: int64

In [80]:
s1 - 1

0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64

In [81]:
s1 * 10

0     10
1     20
2     30
3     40
4     50
5     60
6     70
7     80
8     90
9    100
dtype: int64

In [82]:
s1 / 10

0    0.1
1    0.2
2    0.3
3    0.4
4    0.5
5    0.6
6    0.7
7    0.8
8    0.9
9    1.0
dtype: float64

In [83]:
s1 = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

s2 = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90])

s1+s2

0    11
1    22
2    33
3    44
4    55
5    66
6    77
7    88
8    99
dtype: int64
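
Adding two Series matches elements by index label, not by position. In the example above both Series share the labels 0 to 8, so the result looks positional. The sketch below uses deliberately different (illustrative) labels to show what happens when they only partly overlap.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Labels present in only one Series produce NaN.
total = s1 + s2
print(total)

# add() with fill_value treats a missing side as 0.
total_filled = s1.add(s2, fill_value=0)
print(total_filled)
```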

In [84]:
df = pd.DataFrame()
print(df)

Empty DataFrame
Columns: []
Index: []


In [85]:
df = pd.DataFrame([1, 2, 3, 4, 5])
print(df)

   0
0  1
1  2
2  3
3  4
4  5


In [86]:
data = [['Alex', 10, '10'], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data)

In [87]:
df

        0   1     2
0    Alex  10  10.0
1     Bob  12   NaN
2  Clarke  13   NaN


In [88]:
df = pd.DataFrame(data, columns = ['Name', 'Age', 'Marks'])

In [89]:
df

     Name  Age  Marks
0    Alex   10   10.0
1     Bob   12    NaN
2  Clarke   13    NaN


In [90]:
data = {
    'Calories': [420, 380, 390],
    'Duration': [50, 40, 45]
}

In [91]:
n = pd.DataFrame(data)

In [92]:
n

   Calories  Duration
0       420        50
1       380        40
2       390        45


In [93]:
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
    'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

In [96]:
z=pd.DataFrame(d)

In [97]:
z

   one  two
a  1.0    1
b  2.0    2
c  3.0    3
d  NaN    4


In [98]:
d = {'one': pd.Series([1, 2, 3]),
    'two': pd.Series([1, 2, 3, 4])}

In [99]:
z=pd.DataFrame(d)

In [100]:
z

   one  two
0  1.0    1
1  2.0    2
2  3.0    3
3  NaN    4


In [102]:
sharktank_df = pd.read_csv(r'dataset/Shark_Tank.csv')

In [103]:
sharktank_df.head()

Unnamed: 0,episode_number,pitch_number,brand_name,idea,deal,pitcher_ask_amount,ask_equity,ask_valuation,deal_amount,deal_equity,...,ashneer_deal,anupam_deal,aman_deal,namita_deal,vineeta_deal,peyush_deal,ghazal_deal,total_sharks_invested,amount_per_shark,equity_per_shark
0,1,1,BluePine Industries,Frozen Momos,1,50.0,5.0,1000.0,75.0,16.0,...,1,0,1,0,1,0,0,3,25.0,5.333333
1,1,2,Booz scooters,Renting e-bike for mobility in private spaces,1,40.0,15.0,266.67,40.0,50.0,...,1,0,0,0,1,0,0,2,20.0,25.0
2,1,3,Heart up my Sleeves,Detachable Sleeves,1,25.0,10.0,250.0,25.0,30.0,...,0,1,0,0,1,0,0,2,12.5,15.0
3,2,4,Tagz Foods,Healthy Potato Chips,1,70.0,1.0,7000.0,70.0,2.75,...,1,0,0,0,0,0,0,1,70.0,2.75
4,2,5,Head and Heart,Brain Development Course,0,50.0,5.0,1000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0


In [105]:
sharktank_df.head(20)

Unnamed: 0,episode_number,pitch_number,brand_name,idea,deal,pitcher_ask_amount,ask_equity,ask_valuation,deal_amount,deal_equity,...,ashneer_deal,anupam_deal,aman_deal,namita_deal,vineeta_deal,peyush_deal,ghazal_deal,total_sharks_invested,amount_per_shark,equity_per_shark
0,1,1,BluePine Industries,Frozen Momos,1,50.0,5.0,1000.0,75.0,16.0,...,1,0,1,0,1,0,0,3,25.0,5.333333
1,1,2,Booz scooters,Renting e-bike for mobility in private spaces,1,40.0,15.0,266.67,40.0,50.0,...,1,0,0,0,1,0,0,2,20.0,25.0
2,1,3,Heart up my Sleeves,Detachable Sleeves,1,25.0,10.0,250.0,25.0,30.0,...,0,1,0,0,1,0,0,2,12.5,15.0
3,2,4,Tagz Foods,Healthy Potato Chips,1,70.0,1.0,7000.0,70.0,2.75,...,1,0,0,0,0,0,0,1,70.0,2.75
4,2,5,Head and Heart,Brain Development Course,0,50.0,5.0,1000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
5,2,6,Agro tourism,Tourism,0,50.0,5.0,1000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
6,3,7,Qzense Labs,Food Freshness Detector,0,100.0,0.25,40000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
7,3,8,Peeschute,Disposable Urine Bag,1,75.0,4.0,1875.0,75.0,6.0,...,0,0,1,0,0,0,0,1,75.0,6.0
8,3,9,NOCD,Energy Drink,1,50.0,2.0,2500.0,20.0,15.0,...,0,0,0,0,1,0,0,1,20.0,15.0
9,4,10,Cosiq,Intelligent Skincare,1,50.0,7.5,666.67,50.0,25.0,...,0,1,0,0,1,0,0,2,25.0,12.5


In [104]:
sharktank_df.tail()

Unnamed: 0,episode_number,pitch_number,brand_name,idea,deal,pitcher_ask_amount,ask_equity,ask_valuation,deal_amount,deal_equity,...,ashneer_deal,anupam_deal,aman_deal,namita_deal,vineeta_deal,peyush_deal,ghazal_deal,total_sharks_invested,amount_per_shark,equity_per_shark
112,34,113,Green Protein,Plant-Based Protein,0,60.0,2.0,3000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
113,34,114,On2Cook,Fastest Cooking Device,0,100.0,1.0,10000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
114,35,115,Jain Shikanji,Lemonade,1,40.0,8.0,500.0,40.0,30.0,...,1,1,1,0,1,0,0,4,10.0,7.5
115,35,116,Woloo,Washroom Finder,0,50.0,4.0,1250.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
116,35,117,Elcare India,Carenting for Elders,0,100.0,2.5,4000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0


In [106]:
sharktank_df.tail(10)

Unnamed: 0,episode_number,pitch_number,brand_name,idea,deal,pitcher_ask_amount,ask_equity,ask_valuation,deal_amount,deal_equity,...,ashneer_deal,anupam_deal,aman_deal,namita_deal,vineeta_deal,peyush_deal,ghazal_deal,total_sharks_invested,amount_per_shark,equity_per_shark
107,33,108,Mavi's,Vegan Fermented Food,0,40.0,5.0,800.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
108,33,109,Tweek Labs,Sportswear,1,40.0,2.0,2000.0,60.0,10.0,...,1,1,0,0,0,1,0,3,20.0,3.333333
109,33,110,Proxgy,VR,1,35.0,1.0,3500.0,10.0,10.0,...,1,0,0,0,0,1,0,2,5.0,5.0
110,34,111,Nomad Food Project,Bacon Jams,1,40.0,10.0,400.0,40.0,20.0,...,1,0,0,1,1,0,1,4,10.0,5.0
111,34,112,Twee in One,Reversible and convertible clothing,0,30.0,7.5,400.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
112,34,113,Green Protein,Plant-Based Protein,0,60.0,2.0,3000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
113,34,114,On2Cook,Fastest Cooking Device,0,100.0,1.0,10000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
114,35,115,Jain Shikanji,Lemonade,1,40.0,8.0,500.0,40.0,30.0,...,1,1,1,0,1,0,0,4,10.0,7.5
115,35,116,Woloo,Washroom Finder,0,50.0,4.0,1250.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0
116,35,117,Elcare India,Carenting for Elders,0,100.0,2.5,4000.0,0.0,0.0,...,0,0,0,0,0,0,0,0,0.0,0.0


In [107]:
sharktank_df.shape

(117, 28)

In [108]:
sharktank_df.describe()

Unnamed: 0,episode_number,pitch_number,deal,pitcher_ask_amount,ask_equity,ask_valuation,deal_amount,deal_equity,deal_valuation,ashneer_present,...,ashneer_deal,anupam_deal,aman_deal,namita_deal,vineeta_deal,peyush_deal,ghazal_deal,total_sharks_invested,amount_per_shark,equity_per_shark
count,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0,...,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0,117.0
mean,18.735043,59.0,0.555556,319.854709,5.188034,3852.462479,31.982915,8.963504,467.104872,0.837607,...,0.179487,0.205128,0.239316,0.188034,0.128205,0.230769,0.059829,1.230769,18.132481,5.58359
std,10.070778,33.919021,0.499041,2767.842777,3.892121,11931.601957,36.687391,13.106769,919.988864,0.370397,...,0.38541,0.405532,0.428501,0.39242,0.335756,0.423137,0.23819,1.410457,23.588682,10.803799
min,1.0,1.0,0.0,0.00101,0.25,0.01,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,10.0,30.0,0.0,45.0,2.5,666.67,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,19.0,59.0,1.0,50.0,5.0,1250.0,25.0,3.0,100.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,10.0,1.25
75%,27.0,88.0,1.0,80.0,7.5,2857.14,50.0,15.0,500.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,25.0,6.0
max,35.0,117.0,1.0,30000.0,25.0,120000.0,150.0,75.0,6666.67,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,5.0,100.0,75.0


In [109]:
sharktank_df.dtypes

episode_number             int64
pitch_number               int64
brand_name                object
idea                      object
deal                       int64
pitcher_ask_amount       float64
ask_equity               float64
ask_valuation            float64
deal_amount              float64
deal_equity              float64
deal_valuation           float64
ashneer_present            int64
anupam_present             int64
aman_present               int64
namita_present             int64
vineeta_present            int64
peyush_present             int64
ghazal_present             int64
ashneer_deal               int64
anupam_deal                int64
aman_deal                  int64
namita_deal                int64
vineeta_deal               int64
peyush_deal                int64
ghazal_deal                int64
total_sharks_invested      int64
amount_per_shark         float64
equity_per_shark         float64
dtype: object

In [110]:
sharktank_df.columns

Index(['episode_number', 'pitch_number', 'brand_name', 'idea', 'deal',
       'pitcher_ask_amount', 'ask_equity', 'ask_valuation', 'deal_amount',
       'deal_equity', 'deal_valuation', 'ashneer_present', 'anupam_present',
       'aman_present', 'namita_present', 'vineeta_present', 'peyush_present',
       'ghazal_present', 'ashneer_deal', 'anupam_deal', 'aman_deal',
       'namita_deal', 'vineeta_deal', 'peyush_deal', 'ghazal_deal',
       'total_sharks_invested', 'amount_per_shark', 'equity_per_shark'],
      dtype='object')
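
The inspection methods used above (`head`, `tail`, `shape`, `describe`, `dtypes`, `columns`) can be tried without the dataset file. The sketch below reads a tiny CSV from a string instead; the rows and column names are made up for illustration and are not taken from `Shark_Tank.csv`.

```python
import io
import pandas as pd

# Hypothetical stand-in data, not from Shark_Tank.csv.
csv_text = """brand_name,deal,ask_amount
Alpha,1,50.0
Beta,0,40.0
Gamma,1,25.0
Delta,0,70.0
"""

demo_df = pd.read_csv(io.StringIO(csv_text))

print(demo_df.head(2))      # first 2 rows (default is 5)
print(demo_df.tail(2))      # last 2 rows (default is 5)
print(demo_df.shape)        # (4, 3): rows, columns
print(demo_df.dtypes)       # one dtype per column
print(demo_df.describe())   # summary statistics for numeric columns only
print(list(demo_df.columns))
```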

In [111]:
# loc and iloc

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12]}, 
                 index = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])

In [112]:
df

  team  points  assists
A    A       5       11
B    A       7        8
C    A       7       10
D    A       9        6
E    B      12        6
F    B       9        5
G    B       9        9
H    B       4       12


In [113]:
df.loc[['E', 'F']] # loc selects rows by index label

  team  points  assists
E    B      12        6
F    B       9        5


In [114]:
df.iloc[4:6] # iloc selects rows by integer position (the stop position is excluded)

  team  points  assists
E    B      12        6
F    B       9        5
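
One difference between the two is easy to miss: a `loc` slice includes both endpoints, while an `iloc` slice excludes the stop position. The sketch below rebuilds the same DataFrame and also shows selecting a single cell and filtering rows with a condition.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12]},
                  index=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'])

by_label = df.loc['E':'F']      # label slice: 'F' is included
by_position = df.iloc[4:6]      # position slice: position 6 is excluded
print(by_label.equals(by_position))   # True

print(df.loc['E', 'points'])    # row label, column label: 12
print(df.iloc[4, 1])            # row position, column position: 12

# loc also accepts a boolean condition plus a list of columns.
high_scorers = df.loc[df['points'] > 8, ['team', 'points']]
print(high_scorers)
```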


In [115]:
s = pd.Series(list('abcdef'), index=[49, 48, 47, 0, 1, 2])

In [116]:
s

49    a
48    b
47    c
0     d
1     e
2     f
dtype: object

In [117]:
s.loc[0]

'd'

In [118]:
s.iloc[0]

'a'
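
The last two cells show why the distinction matters when the index itself holds integers: `loc[0]` looks up the label 0, while `iloc[0]` takes the first row. A short sketch with the same Series:

```python
import pandas as pd

s = pd.Series(list('abcdef'), index=[49, 48, 47, 0, 1, 2])

print(s.loc[0])       # label 0 -> d
print(s.iloc[0])      # first position -> a
print(s.loc[49])      # label 49 -> a
print(s.iloc[-1])     # last position -> f

print(s.loc[[0, 49]].tolist())    # several labels: ['d', 'a']
print(s.iloc[1:4].tolist())       # positions 1 to 3: ['b', 'c', 'd']
```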