## Familiar Functions (Python Built-in Functions)

+ print() : prints the values to a stream
+ type() : returns data type of the variable
+ str() : creates a new string object from the given object 
+ int() : creates a new integer object
+ bool() : creates a new boolean object
+ float() : creates a new float object
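
As a quick illustration of the conversion functions listed above, here is a minimal sketch. The values and variable names are made up for this example and are not part of the original notebook.

```python
# Illustrative values; all names here are assumptions for this sketch.
count_text = "42"

count = int(count_text)      # string -> integer
ratio = float(count_text)    # string -> float
label = str(3.5)             # float -> string
flag = bool(0)               # zero is treated as False

print(type(count), count)
print(type(ratio), ratio)
print(type(label), label)
print(type(flag), flag)
```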

In [1]:
var1 = [1, 2, 3, 4]

In [2]:
type(var1)

list

In [3]:
len(var1)

4

In [4]:
var2 = True

In [5]:
var2

True

In [6]:
int(var2)

1

### How to get information for a particular built-in function

In [7]:
help(type)

Help on class type in module builtins:

class type(object)
 |  type(object_or_name, bases, dict)
 |  type(object) -> the object's type
 |  type(name, bases, dict) -> a new type
 |  
 |  Methods defined here:
 |  
 |  __call__(self, /, *args, **kwargs)
 |      Call self as a function.
 |  
 |  __delattr__(self, name, /)
 |      Implement delattr(self, name).
 |  
 |  __dir__(...)
 |      __dir__() -> list
 |      specialized __dir__ implementation for types
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __instancecheck__(...)
 |      __instancecheck__() -> bool
 |      check if an object is an instance
 |  
 |  __new__(*args, **kwargs)
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  __prepare__(...)
 |      __prepare__() -> dict
 |      used to create the namespace for the class statement
 |  
 

In [8]:
?type

Alternatively, in a Jupyter notebook, press Shift+Tab after typing the function name.
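
Outside a notebook, the same documentation can be reached from plain Python. The sketch below reads a function's docstring through its `__doc__` attribute and uses `dir()` to list the public method names of a class. The variable names are illustrative.

```python
# Read the text that help() is built from, without the interactive pager.
doc = len.__doc__
print(doc)

# dir() returns the attribute and method names of an object;
# names starting with an underscore are filtered out here.
str_methods = [name for name in dir(str) if not name.startswith("_")]
print(str_methods[:5])
```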

## Methods

Methods are functions that belong to an object and describe the actions it can perform. For instance, an object of a class <b>car</b> can take actions like <b>drive</b>, <b>accelerate</b>, <b>stop</b>, etc. Every built-in class in Python comes with a number of useful built-in methods.
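
To make the car analogy concrete, here is a toy sketch of such a class. The class `Car`, its attribute `speed`, and its methods are hypothetical and exist only for illustration.

```python
class Car:
    """A toy class, used only to illustrate what a method is."""

    def __init__(self):
        self.speed = 0  # current speed in km/h

    def accelerate(self, amount):
        # A method can read and change the object's own data.
        self.speed += amount
        return self.speed

    def stop(self):
        self.speed = 0
        return self.speed


my_car = Car()
my_car.accelerate(30)
speed_after = my_car.accelerate(20)
print(speed_after)

my_car.stop()
print(my_car.speed)
```

Methods are called with the same dot syntax on built-in objects, which is what the string and list examples in this section do.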

#### Built-in methods of class String

In [9]:
# listing all the str methods along with their documentation
help(str)

Help on class str in module builtins:

class str(object)
 |  str(object='') -> str
 |  str(bytes_or_buffer[, encoding[, errors]]) -> str
 |  
 |  Create a new string object from the given object. If encoding or
 |  errors is specified, then the object must expose a data buffer
 |  that will be decoded using the given encoding and error handler.
 |  Otherwise, returns the result of object.__str__() (if defined)
 |  or repr(object).
 |  encoding defaults to sys.getdefaultencoding().
 |  errors defaults to 'strict'.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __format__(...)
 |      S.__format__(format_spec) -> str
 |      
 |      Return a formatted version of S as described by format_spec.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getatt

In [10]:
city = "Gurgaon"

In [11]:
city.upper()

'GURGAON'

In [12]:
city.count('o')

1

#### Built-in methods for class List

In [13]:
heights = [171, 169, 167, 173, 167]

In [14]:
heights.index(167) # returns index of the first element that matches its input

2

In [15]:
heights.count(167) # returns number of times an element appears in a list

2

In [16]:
heights.append(173) # adds an element at the end of the list
heights

[171, 169, 167, 173, 167, 173]

In [17]:
heights.remove(173) # removes the first element of a list that matches the input
heights

[171, 169, 167, 167, 173]

In [18]:
heights.reverse() # reverses the order of the elements in the list it is called on
heights

[173, 167, 167, 169, 171]
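
One detail worth keeping in mind is that list methods such as `reverse()` change the list in place and return `None`, whereas `sorted()` and slicing build a new list. A small sketch of the difference (the variable names are illustrative):

```python
heights = [171, 169, 167, 173, 167]

# sorted() and slicing return new lists and leave the original untouched.
ascending = sorted(heights)
reversed_copy = heights[::-1]

# reverse() modifies the list itself and returns None,
# so its result should not be assigned to a variable.
result = heights.reverse()

print(ascending)
print(reversed_copy)
print(heights)
print(result)
```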

## Packages

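A module is a single Python script file, and a package is a collection of modules that together help to solve a particular kind of problem. Some examples are listed below (math is a single module from the standard library; the others are third-party packages):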

+ math
+ numpy
+ pandas
+ matplotlib

Suppose you want to find the circumference and area of a circle, given its radius. It is easier to use an existing Python package than to write everything from scratch.

In [19]:
radius = 0.4

In [20]:
import math

In [21]:
circumference = 2 * math.pi * radius
circumference

2.5132741228718345

In [22]:
area = math.pi * radius ** 2
area

0.5026548245743669
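
Besides `import math`, individual names can be imported directly with `from ... import ...`. The sketch below uses both styles and checks the area calculation by recovering the radius from it. The variable `recovered_radius` is introduced only for this example.

```python
import math
from math import pi, sqrt  # import selected names only

radius = 0.4

# Both math.pi and pi refer to the same constant.
area = math.pi * radius ** 2

# Invert the area formula to get the radius back.
recovered_radius = sqrt(area / pi)

print(area)
print(recovered_radius)
```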

## NumPy

Why NumPy? Why not use lists? A data scientist often needs to perform a huge number of mathematical operations over a collection of data points, and to perform them quickly.

#### calculation using list

In [25]:
heights = [1.71, 1.69, 1.67, 1.73, 1.67] # in m
weights = [58, 62, 74, 75, 68] # in kg

Let's calculate the BMI for each person, i.e. <b>BMI = weight / height^2</b>

In [27]:
BMI = []
for i in range(len(heights)):
    bmi = weights[i] / heights[i] ** 2
    BMI.append(bmi)

In [28]:
BMI

[19.835162956123252,
 21.707923392038097,
 26.533758829646096,
 25.05930702662969,
 24.38237297859371]

#### calculation using NumPy array

In [29]:
import numpy as np

In [30]:
# first convert the lists into a numpy array
np_heights = np.array(heights)
np_weights = np.array(weights)

print('Python list for heights', heights)
print('NumPy array of heights:', np_heights)

Python list for heights [1.71, 1.69, 1.67, 1.73, 1.67]
NumPy array of heights: [1.71 1.69 1.67 1.73 1.67]


In [33]:
bmi = np_weights / np_heights ** 2
bmi

array([19.83516296, 21.70792339, 26.53375883, 25.05930703, 24.38237298])

As you can see, we performed the whole calculation in one line of code without any for loops. This is known as <b>vectorization</b> or <b>element-wise calculation</b>.
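
As a sanity check, the loop-based and vectorized calculations can be compared directly. The sketch below rebuilds both from the same data and compares them with `np.allclose`, which allows for tiny floating-point differences. The names `bmi_loop` and `bmi_vec` are illustrative.

```python
import numpy as np

heights = [1.71, 1.69, 1.67, 1.73, 1.67]  # in m
weights = [58, 62, 74, 75, 68]            # in kg

# Loop version: zip() pairs each weight with the matching height.
bmi_loop = [w / h ** 2 for w, h in zip(weights, heights)]

# Vectorized version: one expression over whole arrays.
bmi_vec = np.array(weights) / np.array(heights) ** 2

# True when every pair of values agrees within a small tolerance.
print(np.allclose(bmi_loop, bmi_vec))
```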

Let's try the same operation on plain Python lists and look at the error:

In [34]:
bmi = weights / heights ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

<b>Points to Remember:</b> Unlike Python lists, a NumPy array stores elements of a single data type. Before using an array in mathematical operations, make sure that all of its elements share one numeric type. For instance:

In [35]:
# let's create a Python list containing different data types
diff_types = [1.0, 'is', True] 

# convert the above list into NumPy array
np.array(diff_types)

array(['1.0', 'is', 'True'], dtype='<U32')

What did you notice?
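
One way to see what happened is to inspect the `dtype` attribute of the resulting array. The sketch below builds a few mixed lists (made up for this example) and prints the type NumPy settled on for each; the exact integer width can vary by platform, so no output is shown.

```python
import numpy as np

ints_and_floats = np.array([1, 2, 3.5])       # integers are promoted to floats
bools_and_ints = np.array([True, False, 7])   # booleans are promoted to integers
with_text = np.array([1.0, 'is', True])       # everything becomes a string

print(ints_and_floats.dtype)
print(bools_and_ints.dtype)
print(with_text.dtype)
```

In each case NumPy picks a single type that can represent every element, which is why the numbers and the boolean in the last array end up as strings.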

#### NumPy Subsetting

In [38]:
bmi

array([19.83516296, 21.70792339, 26.53375883, 25.05930703, 24.38237298])

In [39]:
# indexing a single element
bmi[1]

21.707923392038097

In [40]:
bmi[:3]

array([19.83516296, 21.70792339, 26.53375883])

There is another way of subsetting NumPy arrays: using Boolean NumPy arrays. For example, suppose we need to find the values in the <b>bmi</b> array above that are greater than 25.

In [42]:
# create a boolean mask
high_bmi = bmi > 25
high_bmi

array([False, False,  True,  True, False])

In [43]:
bmi[high_bmi]

array([26.53375883, 25.05930703])
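
Boolean masks can also be combined. The sketch below selects BMI values within a range, using the rounded values printed above; the thresholds 20 and 25 are arbitrary choices for illustration.

```python
import numpy as np

bmi = np.array([19.83516296, 21.70792339, 26.53375883, 25.05930703, 24.38237298])

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses.
in_range = (bmi > 20) & (bmi < 25)
print(bmi[in_range])

# A mask can be summed to count matches, because True counts as 1.
print(in_range.sum())
```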

#### Some Disadvantages of NumPy arrays

+ NumPy arrays cannot contain elements with different types. If you try to build such an array, some of the elements' types are changed to end up with a homogeneous array. This is known as <b>type coercion</b>.

+ The typical arithmetic operators, such as <b>+, -, *</b> and <b>/</b> have a different meaning for regular Python lists and NumPy arrays. For example

In [44]:
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]

In [45]:
np.array([1, 2, 3]) + np.array([4, 5, 6])

array([5, 7, 9])

So the arithmetic operators on NumPy arrays work element-wise, i.e. in vectorized form.
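
The same difference shows up with `*`. The sketch below contrasts the two behaviours and uses `np.concatenate` for the case where arrays should be joined end to end, as `+` does for lists. The variable names are illustrative.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

print(values * 2)   # list: the elements are repeated
print(arr * 2)      # array: each element is multiplied by 2

# Joining arrays end to end requires an explicit function call.
joined = np.concatenate([arr, np.array([4, 5, 6])])
print(joined)
```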

#### Types of NumPy arrays

+ 1-D array
+ 2-D array
+ ...
+ N-D array

In [46]:
bmi

array([19.83516296, 21.70792339, 26.53375883, 25.05930703, 24.38237298])

In [47]:
type(bmi)

numpy.ndarray

In [52]:
# let's create a 2-D array
np_2d = np.array([heights, weights])
np_2d

array([[ 1.71,  1.69,  1.67,  1.73,  1.67],
       [58.  , 62.  , 74.  , 75.  , 68.  ]])

In [53]:
np_2d.shape

(2, 5)
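
A 2-D array is indexed with one position per axis, in the order row then column. The following sketch rebuilds the array from the same data and picks out a row, a column, and a single value. The names `first_row`, `second_person`, and `one_value` are illustrative.

```python
import numpy as np

heights = [1.71, 1.69, 1.67, 1.73, 1.67]  # in m
weights = [58, 62, 74, 75, 68]            # in kg
np_2d = np.array([heights, weights])

first_row = np_2d[0]          # all heights
second_person = np_2d[:, 1]   # height and weight of the second person
one_value = np_2d[1, 2]       # weight of the third person

print(first_row)
print(second_person)
print(one_value)
print(np_2d.ndim, np_2d.shape)
```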

#### Loading N-D NumPy arrays

![title](Lenna.png)

In [56]:
import matplotlib.pyplot as plt

img = plt.imread('Lenna.png')

In [57]:
img

array([[[0.8862745 , 0.5372549 , 0.4862745 ],
        [0.8784314 , 0.5372549 , 0.5137255 ],
        [0.8784314 , 0.5372549 , 0.49019608],
        ...,
        [0.9137255 , 0.58431375, 0.4862745 ],
        [0.8980392 , 0.5686275 , 0.47058824],
        [0.8117647 , 0.42745098, 0.3764706 ]],

       [[0.8862745 , 0.5372549 , 0.4862745 ],
        [0.8784314 , 0.5372549 , 0.5137255 ],
        [0.8784314 , 0.5372549 , 0.49019608],
        ...,
        [0.9098039 , 0.5803922 , 0.4862745 ],
        [0.8980392 , 0.5647059 , 0.46666667],
        [0.80784315, 0.42352942, 0.3764706 ]],

       [[0.8862745 , 0.5372549 , 0.49019608],
        [0.8745098 , 0.5372549 , 0.5176471 ],
        [0.8784314 , 0.5372549 , 0.49411765],
        ...,
        [0.92156863, 0.5921569 , 0.49411765],
        [0.9137255 , 0.5882353 , 0.47843137],
        [0.83137256, 0.44313726, 0.38431373]],

       ...,

       [[0.34901962, 0.10588235, 0.2509804 ],
        [0.34509805, 0.09803922, 0.23529412],
        [0.35686275, 0

In [58]:
img.shape

(330, 330, 3)

In [59]:
# Red Channel
img[:, :, 0]

array([[0.8862745 , 0.8784314 , 0.8784314 , ..., 0.9137255 , 0.8980392 ,
        0.8117647 ],
       [0.8862745 , 0.8784314 , 0.8784314 , ..., 0.9098039 , 0.8980392 ,
        0.80784315],
       [0.8862745 , 0.8745098 , 0.8784314 , ..., 0.92156863, 0.9137255 ,
        0.83137256],
       ...,
       [0.34901962, 0.34509805, 0.35686275, ..., 0.6313726 , 0.65882355,
        0.6431373 ],
       [0.32156864, 0.3529412 , 0.37254903, ..., 0.6745098 , 0.6862745 ,
        0.69411767],
       [0.31764707, 0.35686275, 0.3764706 , ..., 0.69411767, 0.7058824 ,
        0.72156864]], dtype=float32)

In [60]:
# Green Channel
img[:, :, 1]

array([[0.5372549 , 0.5372549 , 0.5372549 , ..., 0.58431375, 0.5686275 ,
        0.42745098],
       [0.5372549 , 0.5372549 , 0.5372549 , ..., 0.5803922 , 0.5647059 ,
        0.42352942],
       [0.5372549 , 0.5372549 , 0.5372549 , ..., 0.5921569 , 0.5882353 ,
        0.44313726],
       ...,
       [0.10588235, 0.09803922, 0.10588235, ..., 0.2509804 , 0.2627451 ,
        0.23921569],
       [0.07058824, 0.09803922, 0.10980392, ..., 0.27058825, 0.2784314 ,
        0.2627451 ],
       [0.08627451, 0.11372549, 0.11764706, ..., 0.27450982, 0.27058825,
        0.2901961 ]], dtype=float32)

In [61]:
# Blue Channel
img[:, :, 2]

array([[0.4862745 , 0.5137255 , 0.49019608, ..., 0.4862745 , 0.47058824,
        0.3764706 ],
       [0.4862745 , 0.5137255 , 0.49019608, ..., 0.4862745 , 0.46666667,
        0.3764706 ],
       [0.49019608, 0.5176471 , 0.49411765, ..., 0.49411765, 0.47843137,
        0.38431373],
       ...,
       [0.2509804 , 0.23529412, 0.24705882, ..., 0.30980393, 0.3019608 ,
        0.3019608 ],
       [0.23137255, 0.22745098, 0.24313726, ..., 0.3254902 , 0.3137255 ,
        0.30980393],
       [0.22352941, 0.23529412, 0.2509804 , ..., 0.32156864, 0.30588236,
        0.32156864]], dtype=float32)
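
The channel slicing above can be tried without an image file. The sketch below uses a made-up 2x2 RGB array as a stand-in for the array returned by `plt.imread`. The name `fake_img` and its pixel values are assumptions for illustration, and the grayscale step is a plain average of the three channels rather than any standard conversion.

```python
import numpy as np

# A tiny synthetic image: 2 rows, 2 columns, 3 colour channels (R, G, B).
fake_img = np.array(
    [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
     [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]],
    dtype=np.float32,
)

red = fake_img[:, :, 0]                      # red channel, shape (2, 2)
channel_means = fake_img.mean(axis=(0, 1))   # one mean per channel
gray = fake_img.mean(axis=2)                 # average of R, G, B per pixel

print(fake_img.shape)
print(red)
print(channel_means)
print(gray.shape)
```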