### Question 1: What is Concatenation?

* One common task you’ll need to accomplish with any language involves merging or combining strings. This process is referred to as concatenation.

* The best way to describe it is when you take two separate strings – stored by the interpreter – and merge them so that they become one.

* For instance, if one string is “hello” and the other is “ world” (note the leading space), concatenating them produces a single string, “hello world”.


In [11]:
str1 = "Hello"
str2 = " World"
str1 + str2

'Hello World'

#### Note: Python cannot concatenate a string and an integer with `+`. They are two different types of objects, and attempting it raises a `TypeError`. To merge the two, first convert the integer to a string, for example with `str()`.
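A minimal sketch of the note above. The variable names `label` and `count` are made up for illustration: adding a string and an integer fails, while converting the integer first works.

```python
label = "Total: "
count = 5

# Adding a str and an int directly raises TypeError.
try:
    label + count
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Convert the integer to a string first, then concatenate.
combined = label + str(count)

# An f-string performs the same conversion implicitly.
formatted = f"{label}{count}"

print(mixed_ok)   # False
print(combined)   # Total: 5
print(formatted)  # Total: 5
```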



##### NumPy Concatenation



Often you may have two or more NumPy arrays and want to concatenate (join, merge) them into a single array. NumPy offers several functions for this.

A common case is joining two 2-D arrays either row-wise or column-wise. NumPy’s concatenate function handles both. Let us see a couple of examples.

Let us first import the NumPy package.

In [1]:
# import numpy
import numpy as np

Let us create a 1-D NumPy array with the arange function. It holds the integers 0 through 8, and we then reshape it into a 3 x 3 array.

In [3]:
array = np.arange(9)
array2D_1 = array.reshape((3,3))
array2D_1

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [4]:
array2D_2 = np.arange(10,19).reshape(3,3)
array2D_2
 

array([[10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

#### NumPy concatenate

NumPy’s concatenate function joins two or more arrays either row-wise or column-wise. The arrays must have the same shape except along the axis being joined. By default it concatenates row-wise, i.e. axis=0. For the two 3 x 3 arrays above, row-wise concatenation produces an array of shape 6 x 3, i.e. 6 rows and 3 columns.

##### How To Concatenate 2 NumPy Arrays Row-wise?

In [5]:
# concatenate 2 numpy arrays: row-wise
np.concatenate((array2D_1, array2D_2))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

##### How To Concatenate 2 NumPy Arrays Column-wise?

In [6]:
# concatenate 2 numpy arrays: column-wise
np.concatenate((array2D_1,array2D_2),axis=1)

array([[ 0,  1,  2, 10, 11, 12],
       [ 3,  4,  5, 13, 14, 15],
       [ 6,  7,  8, 16, 17, 18]])
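As a hedged illustration of the shape rule for `np.concatenate`: the arrays only need to agree on the axes that are not being joined. The array `c` below is a made-up 2 x 3 example, not part of the original notebook.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)   # 3 rows, 3 columns
c = np.arange(6).reshape(2, 3)   # 2 rows, 3 columns (hypothetical extra array)

# Row-wise: the column counts match (3 and 3), so this works.
stacked = np.concatenate((a, c), axis=0)
print(stacked.shape)  # (5, 3)

# Column-wise: the row counts differ (3 vs 2), so NumPy raises ValueError.
try:
    np.concatenate((a, c), axis=1)
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)  # False
```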

##### NumPy vstack example

NumPy’s vstack stacks arrays vertically, i.e. row-wise. For 2-D arrays the result is the same as using concatenate with axis=0.

In [7]:
np.vstack((array2D_1, array2D_2))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

##### NumPy hstack example

NumPy’s hstack stacks arrays horizontally, i.e. column-wise. For 2-D arrays the result is the same as using concatenate with axis=1.

In [8]:
np.hstack((array2D_1, array2D_2))

array([[ 0,  1,  2, 10, 11, 12],
       [ 3,  4,  5, 13, 14, 15],
       [ 6,  7,  8, 16, 17, 18]])
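The equivalences described above can be checked directly. This is a small sketch that rebuilds the two 3 x 3 arrays under the shorter, hypothetical names `a` and `b`:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(10, 19).reshape(3, 3)

# vstack matches concatenate along axis 0 (rows).
rows_match = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))

# hstack matches concatenate along axis 1 (columns) for 2-D inputs.
cols_match = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))

print(rows_match, cols_match)  # True True
```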

### Question 2: What is the difference between append and Concatenation?



Append : The append method adds a new item to the end of a list. It is also possible to add a new item to the end of a list by using the concatenation operator. However, you need to be careful.

Consider the following example. The original list has 3 integers. We want to add the string “Fight Club” to the end of the list.

In [16]:
origlist = [1, 2, 85]
origlist.append("Fight Club")

In [17]:
origlist

[1, 2, 85, 'Fight Club']

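Concatenate: To use concatenation, we write an assignment statement that follows the accumulator pattern. The list is reassigned to itself plus a new list that holds the item to add (note the square brackets around the item):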

In [22]:
origlist = origlist + ["cat"]

In [23]:
origlist

[1, 2, 85, 'Fight Club', 'cat']

##### Note: One way to think of the difference is that .append() is a specific case of concatenation, while pd.concat() gives you more flexibility
1. Concatenation (`+`) joins two sequences of the same type, such as two lists or two strings. `.append()` adds a single item to the end of a list.
2. Concatenation creates a new object, while `.append()` modifies the existing list in place.
3. Because it does not copy the list, `.append()` is more efficient than repeated concatenation.
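A short sketch of these differences for plain Python lists. The names `alias`, `combined`, `nested`, and `flat` are illustrative only.

```python
original = [1, 2, 85]
alias = original                # second name for the same list object

original.append("Fight Club")   # mutates the list in place
print(alias)                    # [1, 2, 85, 'Fight Club']

combined = original + ["cat"]   # builds a new list; original is unchanged
print(combined is original)     # False
print(original)                 # [1, 2, 85, 'Fight Club']

# append adds its argument as ONE item, even when that item is a list.
nested = [1, 2]
nested.append([3, 4])
print(nested)                   # [1, 2, [3, 4]]

# Concatenation merges the elements of both lists.
flat = [1, 2] + [3, 4]
print(flat)                     # [1, 2, 3, 4]
```

This is why the item must be wrapped in square brackets when using `+`: concatenating a list with a bare string raises a `TypeError`.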



### Pandas concat Vs append

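* `concat` gives you the flexibility to join along either axis (stacking rows with `axis=0` or placing columns side by side with `axis=1`).

* `append` is the specific case of `concat` with `axis=0` and `join='outer'`. Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so new code should use `pd.concat`.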
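A minimal sketch of the axis and join options of `pd.concat`. The two small frames `top` and `bottom` are invented for illustration.

```python
import pandas as pd

top = pd.DataFrame({"Name": ["Jai", "Princi"], "Age": [27, 24]})
bottom = pd.DataFrame({"Name": ["Gaurav"], "City": ["Kanpur"]})

# axis=0 stacks rows; join='outer' keeps the union of the columns.
rows = pd.concat([top, bottom], axis=0, join="outer", ignore_index=True)
print(rows.shape)          # (3, 3)
print(list(rows.columns))  # ['Name', 'Age', 'City']

# join='inner' keeps only the columns shared by every frame.
shared = pd.concat([top, bottom], axis=0, join="inner", ignore_index=True)
print(list(shared.columns))  # ['Name']

# axis=1 places the frames side by side, aligned on the index.
side = pd.concat([top, bottom], axis=1)
print(side.shape)          # (2, 4)
```

The `axis=0, join="outer"` call is the behaviour that the old `append` method provided.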

#### np.concatenate()

In [32]:
a=[1,2,3,4,5]
b=[0,9,8,7,6]
np.concatenate([a,b])

array([1, 2, 3, 4, 5, 0, 9, 8, 7, 6])

#### np.append()

In [33]:
np.append([1, 2, 3], [[4, 5, 6], [7, 8, 9]])

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
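The flattening seen above happens because `np.append` defaults to `axis=None`. As a hedged sketch, passing `axis=0` keeps the 2-D layout, provided both inputs have the same number of dimensions. The names `row` and `more` are illustrative.

```python
import numpy as np

row = np.array([[1, 2, 3]])               # shape (1, 3)
more = np.array([[4, 5, 6], [7, 8, 9]])   # shape (2, 3)

# axis=None (the default): both inputs are flattened before joining.
flat = np.append(row, more)
print(flat.shape)     # (9,)

# axis=0: rows are appended, same result as np.concatenate((row, more), axis=0).
stacked = np.append(row, more, axis=0)
print(stacked.shape)  # (3, 3)
```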

### Question 3: Explain Primary Key, Composite Key and Foreign Key

##### 1. Primary Key
A primary key is a table column that can be used to uniquely identify every row of the table. Any column that has this property will do -- these columns are called candidate keys. A table can have many candidate keys but only one primary key. The primary key cannot be null.



In this example, only FirstName is a single-column candidate key, because it is the only column that is unique and not null.

##### 2. Composite Key
A composite primary key is a primary key consisting of more than one column. For example, the combinations (RecordNo,FirstName), (RecordNo,LastName), (RecordNo,FirstName,LastName), and (FirstName,LastName) are all candidate keys. Any combination including Age is not a candidate key because it contains a null.
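A small sketch of a composite primary key, using Python's built-in `sqlite3` module. The `enrollment` table and its columns are hypothetical and are not the table discussed above. Neither column is unique on its own, but each pair of values may appear only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    )
    """
)
conn.execute("INSERT INTO enrollment VALUES (1, 101, 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 102, 'B')")  # same student, other course: allowed
conn.execute("INSERT INTO enrollment VALUES (2, 101, 'C')")  # same course, other student: allowed

# Repeating an existing (student_id, course_id) pair violates the key.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 101, 'F')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_allowed, row_count)  # False 3
conn.close()
```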

##### 3. Foreign Key
A foreign key is a column, or combination of columns, that contains values found in the primary key of some table (possibly the same table). A foreign key may be null, and it is usually not unique.

Here we have two tables that are related via a foreign key.

As you can see, the ClrFK column in the second table is a foreign key to the ClrPK primary key in the first table. Notice that the ClrPK values are unique and not null, but the ClrFK values may be null and often repeat. A null foreign key means that that particular row does not participate in the relationship. The fact that many foreign key values repeat simply reflects the fact that it's a one-to-many relationship.

In a one-to-many relationship, the primary key has the "one" value, and the foreign key has the "many" values. The trick to remembering this is to keep in mind that the primary key must be unique.
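The one-to-many relationship can be sketched with Python's built-in `sqlite3` module. The column names `ClrPK` and `ClrFK` follow the text, but the table names (`color`, `item`) and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE color (ClrPK INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT, "
    "ClrFK INTEGER REFERENCES color(ClrPK))"
)
conn.executemany("INSERT INTO color VALUES (?, ?)", [(1, "red"), (2, "blue")])

# Foreign key values may repeat (one color, many items) and may be NULL.
conn.executemany(
    "INSERT INTO item (label, ClrFK) VALUES (?, ?)",
    [("apple", 1), ("cherry", 1), ("sky", 2), ("glass", None)],
)

# A value with no matching primary key is rejected.
try:
    conn.execute("INSERT INTO item (label, ClrFK) VALUES ('ghost', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

fk_values = [r[0] for r in conn.execute("SELECT ClrFK FROM item ORDER BY id")]
print(orphan_allowed)  # False
print(fk_values)       # [1, 1, 2, None]
conn.close()
```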

### Question 4: Explain the Concept of Joins

To join DataFrames, we use the .join() function. It combines the columns of two potentially differently-indexed DataFrames into a single result DataFrame. By default it performs a left join on the index, so every row of the calling DataFrame is kept.

In [1]:
# importing pandas module
import pandas as pd 
  
# Define a dictionary containing employee data 
data1 = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'], 
        'Age':[27, 24, 22, 32]} 
    
# Define a dictionary containing employee data 
data2 = {'Address':['Allahabad', 'Kannuaj', 'Allahabad', 'Kannuaj'], 
        'Qualification':['MCA', 'Phd', 'Bcom', 'B.hons']} 
  
# Convert the dictionary into DataFrame  
df = pd.DataFrame(data1,index=['K0', 'K1', 'K2', 'K3'])
  
# Convert the dictionary into DataFrame  
df1 = pd.DataFrame(data2, index=['K0', 'K2', 'K3', 'K4'])
 
 
print(df, "\n\n", df1)  

      Name  Age
K0     Jai   27
K1  Princi   24
K2  Gaurav   22
K3    Anuj   32 

       Address Qualification
K0  Allahabad           MCA
K2    Kannuaj           Phd
K3  Allahabad          Bcom
K4    Kannuaj        B.hons


In [2]:
# joining dataframe
res = df.join(df1)
 
res

      Name  Age    Address Qualification
K0     Jai   27  Allahabad           MCA
K1  Princi   24        NaN           NaN
K2  Gaurav   22    Kannuaj           Phd
K3    Anuj   32  Allahabad          Bcom


##### Now we use how = 'outer' in order to get union

In [3]:
# getting union
res1 = df.join(df1, how='outer')
 
res1

      Name   Age    Address Qualification
K0     Jai  27.0  Allahabad           MCA
K1  Princi  24.0        NaN           NaN
K2  Gaurav  22.0    Kannuaj           Phd
K3    Anuj  32.0  Allahabad          Bcom
K4     NaN   NaN    Kannuaj        B.hons


##### Joining DataFrames using the on argument :
join() takes an optional on argument, which may be a column name or a list of column names in the calling DataFrame. The index of the passed DataFrame is then matched against the values in that column instead of against the calling DataFrame's index.

In [5]:
# importing pandas module
import pandas as pd 
  
# Define a dictionary containing employee data 
data1 = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'], 
        'Age':[27, 24, 22, 32],
        'Key':['K0', 'K1', 'K2', 'K3']} 
    
# Define a dictionary containing employee data 
data2 = {'Address':['Allahabad', 'Kannuaj', 'Allahabad', 'Kannuaj'], 
        'Qualification':['MCA', 'Phd', 'Bcom', 'B.hons']} 
  
# Convert the dictionary into DataFrame  
df = pd.DataFrame(data1)
  
# Convert the dictionary into DataFrame  
df1 = pd.DataFrame(data2, index=['K0', 'K2', 'K3', 'K4'])
 
 
print(df, "\n\n", df1) 

     Name  Age Key
0     Jai   27  K0
1  Princi   24  K1
2  Gaurav   22  K2
3    Anuj   32  K3 

       Address Qualification
K0  Allahabad           MCA
K2    Kannuaj           Phd
K3  Allahabad          Bcom
K4    Kannuaj        B.hons


In [6]:
# using on argument in join
res2 = df.join(df1, on='Key')
 
res2

     Name  Age Key    Address Qualification
0     Jai   27  K0  Allahabad           MCA
1  Princi   24  K1        NaN           NaN
2  Gaurav   22  K2    Kannuaj           Phd
3    Anuj   32  K3  Allahabad          Bcom
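As a hedged cross-check, the same result can be obtained with `pd.merge` by matching the left frame's `Key` column against the right frame's index. The sketch below rebuilds the two frames from this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
                   "Age": [27, 24, 22, 32],
                   "Key": ["K0", "K1", "K2", "K3"]})
df1 = pd.DataFrame({"Address": ["Allahabad", "Kannuaj", "Allahabad", "Kannuaj"],
                    "Qualification": ["MCA", "Phd", "Bcom", "B.hons"]},
                   index=["K0", "K2", "K3", "K4"])

# join with on='Key': left join of df's Key column against df1's index.
joined = df.join(df1, on="Key")

# The same operation written as an explicit merge.
merged = pd.merge(df, df1, left_on="Key", right_index=True, how="left")

print(joined.equals(merged))              # True
print(joined["Address"].isna().tolist())  # [False, True, False, False]
```

The row for `K1` has no match in `df1`, so its `Address` and `Qualification` are missing. `K4` exists only in `df1` and is dropped by the left join.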


##### Joining singly-indexed DataFrame with multi-indexed DataFrame :
When a singly-indexed DataFrame is joined with a multi-indexed DataFrame, the name of the single index is matched against the level of the same name in the MultiIndex, and rows are aligned on that level.

In [7]:
# importing pandas module
import pandas as pd 
  
# Define a dictionary containing employee data 
data1 = {'Name':['Jai', 'Princi', 'Gaurav'], 
        'Age':[27, 24, 22]} 
    
# Define a dictionary containing employee data 
data2 = {'Address':['Allahabad', 'Kannuaj', 'Allahabad', 'Kanpur'], 
        'Qualification':['MCA', 'Phd', 'Bcom', 'B.hons']} 
  
# Convert the dictionary into DataFrame  
df = pd.DataFrame(data1, index=pd.Index(['K0', 'K1', 'K2'], name='key'))
 
index = pd.MultiIndex.from_tuples([('K0', 'Y0'), ('K1', 'Y1'),
                                   ('K2', 'Y2'), ('K2', 'Y3')],
                                   names=['key', 'Y'])
  
# Convert the dictionary into DataFrame  
df1 = pd.DataFrame(data2, index= index)
 
 
print(df, "\n\n", df1)

       Name  Age
key             
K0      Jai   27
K1   Princi   24
K2   Gaurav   22 

           Address Qualification
key Y                          
K0  Y0  Allahabad           MCA
K1  Y1    Kannuaj           Phd
K2  Y2  Allahabad          Bcom
    Y3     Kanpur        B.hons


In [8]:
# joining singly indexed with
# multi indexed
result = df.join(df1, how='inner')
 
result

          Name  Age    Address Qualification
key Y
K0  Y0     Jai   27  Allahabad           MCA
K1  Y1  Princi   24    Kannuaj           Phd
K2  Y2  Gaurav   22  Allahabad          Bcom
    Y3  Gaurav   22     Kanpur        B.hons


There are four different types of joins:

* (INNER) JOIN: Returns records that have matching values in both tables
* LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table
* RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table
* FULL (OUTER) JOIN: Returns all records from both tables, matched where possible
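The four join types map onto the `how` argument of `pd.merge`. This sketch uses two small, illustrative frames (`left` and `right`) that share the keys K0, K2 and K3, and lists which keys survive each join.

```python
import pandas as pd

left = pd.DataFrame({"Key": ["K0", "K1", "K2", "K3"],
                     "Name": ["Jai", "Princi", "Gaurav", "Anuj"]})
right = pd.DataFrame({"Key": ["K0", "K2", "K3", "K4"],
                      "Address": ["Allahabad", "Kannuaj", "Allahabad", "Kannuaj"]})

# Collect the keys present in the result of each join type.
keys_by_join = {
    how: pd.merge(left, right, on="Key", how=how)["Key"].tolist()
    for how in ("inner", "left", "right", "outer")
}

for how, keys in keys_by_join.items():
    print(how, keys)
# inner -> K0, K2, K3          (keys present in both frames)
# left  -> K0, K1, K2, K3      (every key from the left frame)
# right -> K0, K2, K3, K4      (every key from the right frame)
# outer -> K0, K1, K2, K3, K4  (union of the keys)
```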
