### Creating Pandas DataFrame

#### Using DataFrame constructor pd.DataFrame()

The pandas DataFrame() constructor offers many different ways to create and initialize a dataframe.

#### Method 0:

In [4]:
import numpy as np
import pandas as pd

In [2]:
# method 0
# Initialize a blank dataframe 

df = pd.DataFrame()
print(df)
print(df.shape)

# Initialize a blank dataframe with column names and keep adding

df1 = pd.DataFrame(columns=['Name','Age','Address'])
print(df1)

# Add records to dataframe using the .loc function
df1.loc[0] = ['Anil',23,'Delhi']
df1.loc[1] = ['Ram',34, 'EDelhi']
print(df1)

Empty DataFrame
Columns: []
Index: []
(0, 0)
Empty DataFrame
Columns: [Name, Age, Address]
Index: []
   Name  Age Address
0  Anil   23   Delhi
1   Ram   34  EDelhi


In [3]:
# Add records to dataframe using the .loc function
df1.loc[0] = ['Anil',23,'Delhi']
df1.loc[1] = ['Ram',34, 'EDelhi']
df1.loc[3] = ['Himani',23,'FBD']
print(df1)

     Name  Age Address
0    Anil   23   Delhi
1     Ram   34  EDelhi
3  Himani   23     FBD


#### Method 1:
Using a NumPy array in the DataFrame constructor. Pass a 2D NumPy array; each inner array becomes the corresponding row of the dataframe.

In [4]:
# Pass a 2D numpy array - each row is the corresponding row required in the dataframe

data = np.array([['India','INR','Hindi'],
                 ['USA','Dollar','Eng'],
                ['UK','Pound','Eng']])

# pass column names in the columns parameter 

df2 = pd.DataFrame(data, columns=['Country','Currency','Language'])
df2

Unnamed: 0,Country,Currency,Language
0,India,INR,Hindi
1,USA,Dollar,Eng
2,UK,Pound,Eng


#### Method 2:

Using a dictionary in the DataFrame constructor. Dictionary keys become column names in the dataframe, and each dictionary value (a list) becomes the values of that column. Row i of the dataframe is made up of the i-th element of every list, so all lists must have the same length.

In [5]:
# Creating a dictionary

data = {'year': [2014, 2018,2020,2017], 
        'make': ["toyota", "honda","hyndai","nissan"],
        'model':["corolla", "civic","accent","sentra"]
       }

# Creating a dataframe using above dictionary
df3 = pd.DataFrame(data)
df3

Unnamed: 0,year,make,model
0,2014,toyota,corolla
1,2018,honda,civic
2,2020,hyndai,accent
3,2017,nissan,sentra


#### Method 3:

Using a list of dictionaries in the DataFrame constructor. Each dictionary is a record. Dictionary Keys become Column names in the dataframe. Dictionary values become the values of columns

In [6]:
data = [{'year': 2014, 'make': "toyota", 'model':"corolla"}, 
        {'year': 2018, 'make': "honda", 'model':"civic"}, 
        {'year': 2020, 'make': "hyndai", 'model':"nissan"}, 
        {'year': 2017, 'make': "nissan" ,'model':"sentra"}
       ]

# column names are taken from the dictionary keys
df4 = pd.DataFrame(data)
df4

Unnamed: 0,year,make,model
0,2014,toyota,corolla
1,2018,honda,civic
2,2020,hyndai,nissan
3,2017,nissan,sentra


#### Method 4:

Using a dictionary in the from_dict method. Dictionary keys become column names in the dataframe. Dictionary values become the values of columns. Column values are combined in a single row according to the order in which they are specified: `pd.DataFrame.from_dict(data)`

In [7]:
data = {'year': [2014, 2018,2020,2017], 
        'make': ["toyota", "honda","hyndai","nissan"],
        'model':["corolla", "civic","accent","sentra"]
       }

# column names are taken from the dictionary keys
# using pd.DataFrame.from_dict(...)

df5 = pd.DataFrame.from_dict(data)
df5

Unnamed: 0,year,make,model
0,2014,toyota,corolla
1,2018,honda,civic
2,2020,hyndai,accent
3,2017,nissan,sentra


### Using pandas library functions — read_csv

#### Method 5:

From a CSV file using the read_csv method of the pandas library. This is one of the most common ways of creating a dataframe for EDA. The delimiter (or separator), the header, and the choice of index column are all configurable. By default, the separator is a comma, the header is inferred from the first line if found, and no index column is taken from the file. Here is what the file looks like:

In [8]:
df6 = pd.read_csv("DataSets/forbes_2022_billionaires.csv")
df6

Unnamed: 0,rank,personName,age,finalWorth,year,month,category,source,country,state,...,organization,selfMade,gender,birthDate,title,philanthropyScore,residenceMsa,numberOfSiblings,bio,about
0,1,Elon Musk,50.0,219000.0,2022,4,Automotive,"Tesla, SpaceX",United States,Texas,...,Tesla,True,M,1971-06-28,CEO,1.0,,,Elon Musk is working to revolutionize transpor...,Musk was accepted to a graduate program at Sta...
1,2,Jeff Bezos,58.0,171000.0,2022,4,Technology,Amazon,United States,Washington,...,Amazon,True,M,1964-01-12,Entrepreneur,1.0,"Seattle-Tacoma-Bellevue, WA",,Jeff Bezos founded e-commerce giant Amazon in ...,"Growing up, Jeff Bezos worked summers on his g..."
2,3,Bernard Arnault & family,73.0,158000.0,2022,4,Fashion & Retail,LVMH,France,,...,LVMH Moët Hennessy Louis Vuitton,False,M,1949-03-05,Chairman and CEO,,,,Bernard Arnault oversees the LVMH empire of so...,"Arnault apparently wooed his wife, Helene Merc..."
3,4,Bill Gates,66.0,129000.0,2022,4,Technology,Microsoft,United States,Washington,...,Bill & Melinda Gates Foundation,True,M,1955-10-28,Cofounder,4.0,"Seattle-Tacoma-Bellevue, WA",,Bill Gates turned his fortune from software fi...,"When Gates was a kid, he spent so much time re..."
4,5,Warren Buffett,91.0,118000.0,2022,4,Finance & Investments,Berkshire Hathaway,United States,Nebraska,...,Berkshire Hathaway,True,M,1930-08-30,CEO,5.0,"Omaha, NE",,"Known as the ""Oracle of Omaha,"" Warren Buffett...","Buffett still lives in the same Omaha, Nebrask..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2663,2578,Zhang Yuqiang,66.0,1000.0,2022,4,Manufacturing,Fiberglass,China,,...,,True,M,1955-09-01,,,,,"Zhang Yuqiang chairs Zhenshi Holding Group, a ...",
2664,2578,Zhou Ruxin,59.0,1000.0,2022,4,Technology,Navigation,China,,...,,True,M,1963-03-01,,,,,"Zhou Ruxin chairs Beijing BDStar Navigation, a...",
2665,2578,Wen Zhou & family,57.0,1000.0,2022,4,Manufacturing,chemicals,China,,...,,True,M,1965-03-06,,,,,"Zhou Wen chairs Shanghai Pret Composites, a su...",
2666,2578,Zhou Yifeng & family,43.0,1000.0,2022,4,Energy,liquefied petroleum gas,China,,...,,True,F,1978-07-11,,,,,Zhou Yifeng chairs Shenzhen-listed Oriental En...,
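To make the configurable parameters concrete, here is a small sketch that does not need a file on disk. The semicolon-separated text and the names `raw` and `df_people` are assumptions made up for this example; only `sep`, `header`, and `index_col` are the `read_csv` parameters being illustrated.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated data standing in for a file on disk
raw = "rank;name;country\n1;Alice;India\n2;Bob;USA\n"

# sep sets the delimiter, header=0 takes column names from the first line,
# index_col='rank' uses that column as the row index
df_people = pd.read_csv(io.StringIO(raw), sep=';', header=0, index_col='rank')
print(df_people)
```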


### Using pandas library functions — read_excel

#### Method 6:
From an Excel file using the read_excel method of the pandas library. This is another common way of creating a dataframe for EDA. The sheet to read (sheet_name), the header row, and the choice of index column are configurable. By default, the first sheet is read, the header is taken from the first row, and no index column is taken from the file. Passing a list to sheet_name returns a dictionary of dataframes keyed by sheet. Here is what the file looks like:

In [9]:
df7 = pd.read_excel(r'~\Superstore.xls')  # raw string so the backslash is kept literally
df7

Unnamed: 0,Row ID,Order ID,Order Date,Ship Date,Ship Mode,Customer ID,Customer Name,Segment,Country,City,...,Postal Code,Region,Product ID,Category,Sub-Category,Product Name,Sales,Quantity,Discount,Profit
0,1,CA-2016-152156,2016-11-08,2016-11-11,Second Class,CG-12520,Claire Gute,Consumer,United States,Henderson,...,42420,South,FUR-BO-10001798,Furniture,Bookcases,Bush Somerset Collection Bookcase,261.9600,2,0.00,41.9136
1,2,CA-2016-152156,2016-11-08,2016-11-11,Second Class,CG-12520,Claire Gute,Consumer,United States,Henderson,...,42420,South,FUR-CH-10000454,Furniture,Chairs,"Hon Deluxe Fabric Upholstered Stacking Chairs,...",731.9400,3,0.00,219.5820
2,3,CA-2016-138688,2016-06-12,2016-06-16,Second Class,DV-13045,Darrin Van Huff,Corporate,United States,Los Angeles,...,90036,West,OFF-LA-10000240,Office Supplies,Labels,Self-Adhesive Address Labels for Typewriters b...,14.6200,2,0.00,6.8714
3,4,US-2015-108966,2015-10-11,2015-10-18,Standard Class,SO-20335,Sean O'Donnell,Consumer,United States,Fort Lauderdale,...,33311,South,FUR-TA-10000577,Furniture,Tables,Bretford CR4500 Series Slim Rectangular Table,957.5775,5,0.45,-383.0310
4,5,US-2015-108966,2015-10-11,2015-10-18,Standard Class,SO-20335,Sean O'Donnell,Consumer,United States,Fort Lauderdale,...,33311,South,OFF-ST-10000760,Office Supplies,Storage,Eldon Fold 'N Roll Cart System,22.3680,2,0.20,2.5164
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9989,9990,CA-2014-110422,2014-01-21,2014-01-23,Second Class,TB-21400,Tom Boeckenhauer,Consumer,United States,Miami,...,33180,South,FUR-FU-10001889,Furniture,Furnishings,Ultra Door Pull Handle,25.2480,3,0.20,4.1028
9990,9991,CA-2017-121258,2017-02-26,2017-03-03,Standard Class,DB-13060,Dave Brooks,Consumer,United States,Costa Mesa,...,92627,West,FUR-FU-10000747,Furniture,Furnishings,Tenex B1-RE Series Chair Mats for Low Pile Car...,91.9600,2,0.00,15.6332
9991,9992,CA-2017-121258,2017-02-26,2017-03-03,Standard Class,DB-13060,Dave Brooks,Consumer,United States,Costa Mesa,...,92627,West,TEC-PH-10003645,Technology,Phones,Aastra 57i VoIP phone,258.5760,2,0.20,19.3932
9992,9993,CA-2017-121258,2017-02-26,2017-03-03,Standard Class,DB-13060,Dave Brooks,Consumer,United States,Costa Mesa,...,92627,West,OFF-PA-10004041,Office Supplies,Paper,"It's Hot Message Books with Stickers, 2 3/4"" x 5""",29.6000,4,0.00,13.3200


In [10]:
df7 = pd.read_excel(r'~\Superstore.xls', sheet_name=[0,2])  # returns a dict of dataframes
df7

{0:       Row ID        Order ID Order Date  Ship Date       Ship Mode  \
 0          1  CA-2016-152156 2016-11-08 2016-11-11    Second Class   
 1          2  CA-2016-152156 2016-11-08 2016-11-11    Second Class   
 2          3  CA-2016-138688 2016-06-12 2016-06-16    Second Class   
 3          4  US-2015-108966 2015-10-11 2015-10-18  Standard Class   
 4          5  US-2015-108966 2015-10-11 2015-10-18  Standard Class   
 ...      ...             ...        ...        ...             ...   
 9989    9990  CA-2014-110422 2014-01-21 2014-01-23    Second Class   
 9990    9991  CA-2017-121258 2017-02-26 2017-03-03  Standard Class   
 9991    9992  CA-2017-121258 2017-02-26 2017-03-03  Standard Class   
 9992    9993  CA-2017-121258 2017-02-26 2017-03-03  Standard Class   
 9993    9994  CA-2017-119914 2017-05-04 2017-05-09    Second Class   
 
      Customer ID     Customer Name    Segment        Country             City  \
 0       CG-12520       Claire Gute   Consumer  United States

### Using pandas library functions — read_json

#### Method 7:

From a JSON file using the read_json method of the pandas library when the JSON file has one record on each line. Setting lines=True means the file is read as one JSON object per line. Here is what the JSON file looks like:

In [11]:
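# Creating a json file
# df7 is a dict of dataframes (sheet_name=[0,2] was passed above),
# so pick one sheet and write one JSON record per line

df7[0].to_json('superstore.json', orient='records', lines=True)

# Reading a json file with one record per line

df8 = pd.read_json('superstore.json', lines=True)
df8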

#### Method 8:

From a string of CSV records using the read_csv method of the pandas library. This is particularly useful when we don't want to create a file but have the records handy as text: all we do is convert the CSV "string" to a file handle using the StringIO class from the io module.

In [None]:
import io

# f is a file handle created from a csv like string
# StringIO(string)

f= 'year,make,model\n2014,toyota,corolla\n2018,honda,civic\n2020,hyndai,accent\n2017,nissan,sentra'
f = io.StringIO(f)
df9 = pd.read_csv(f)
df9

#### Method 9:

From a string of JSON records using the read_json method of the pandas library. This is particularly useful when we don't want to create a file but have the JSON records handy.

In [None]:
from io import StringIO
# Home Work
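One possible solution to the exercise, offered as a sketch rather than the only answer. The records in `json_lines` are assumptions that reuse the car data from the earlier methods; the approach mirrors Method 8, wrapping the string in `StringIO` so that `read_json` can treat it like a file.

```python
from io import StringIO
import pandas as pd

# One JSON object per line (hypothetical records reusing the car data)
json_lines = (
    '{"year": 2014, "make": "toyota", "model": "corolla"}\n'
    '{"year": 2018, "make": "honda", "model": "civic"}\n'
)

# lines=True tells read_json to parse each line as a separate record
df_from_json = pd.read_json(StringIO(json_lines), lines=True)
print(df_from_json)
```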




#### Method 10: Reading HTML

We can read the tables in an HTML file using the read_html() function. This function reads the tables of an HTML document into a list of Pandas DataFrames. It can read from a file or a URL.

Let's have a look at each input source one by one.

#### Reading HTML Data From a File

For this, I'll use one set of input data. One table contains programming languages and the year of their creation. The other table has land sizes and their cost in USD.

Save the following HTML content in a file called table_data.html:
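The HTML listing itself is not reproduced here, so the sketch below rebuilds a plausible version from the table contents printed further down. The exact markup is an assumption; only the cell values come from that output. It writes the content to `table_data.html` using Python instead of a text editor.

```python
from pathlib import Path

# HTML with two tables, reconstructed from the output printed later on
html_content = """
<table>
  <thead>
    <tr><th>Programming Language</th><th>Creator</th><th>Year</th></tr>
  </thead>
  <tbody>
    <tr><td>C</td><td>Dennis Ritchie</td><td>1972</td></tr>
    <tr><td>Python</td><td>Guido Van Rossum</td><td>1989</td></tr>
    <tr><td>Ruby</td><td>Yukihiro Matsumoto</td><td>1995</td></tr>
  </tbody>
</table>

<table>
  <thead>
    <tr><th>Area (sq.ft)</th><th>Price (USD)</th></tr>
  </thead>
  <tbody>
    <tr><td>12000</td><td>500</td></tr>
    <tr><td>32000</td><td>700</td></tr>
  </tbody>
</table>
"""

# Write the file into the current working directory
Path('table_data.html').write_text(html_content, encoding='utf-8')
```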

In [1]:
!pip install lxml






In [24]:
import pandas as pd

tables = pd.read_html('table_data.html')
print('Tables found:', len(tables))
df1 = tables[0]  # Save first table in variable df1
df2 = tables[1]  # Saving next table in variable df2

print('First Table')
print(df1)
print('Another Table')
print(df2)

Tables found: 2
First Table
  Programming Language             Creator  Year
0                    C      Dennis Ritchie  1972
1               Python    Guido Van Rossum  1989
2                 Ruby  Yukihiro Matsumoto  1995
Another Table
   Area (sq.ft)  Price (USD)
0         12000          500
1         32000          700


#### Reading HTML Data From URL

Just as we read table elements from an HTML file, we can also read table elements from an HTML web page into a DataFrame with read_html(). However, in place of the file name, we will provide a URL like this:

    read_html('https://en.wikipedia.org/wiki/Python_(programming_language)')

In [6]:
tables = pd.read_html('https://en.wikipedia.org/wiki/Python_(programming_language)')
print('Tables found:', len(tables))
df1 = tables[0]  # Save first table in variable df1
print('First Table')
print(df1.head())  # To print first 5 rows

Tables found: 13
First Table
             0                                                  1
0          NaN                                                NaN
1          NaN                                                NaN
2     Paradigm  Multi-paradigm: object-oriented,[1] procedural...
3  Designed by                                   Guido van Rossum
4    Developer                         Python Software Foundation


### Alternate solution

In [9]:
import io
import requests
import pandas

url = 'https://www.goodcarbadcar.net/2020-us-vehicle-sales-figures-by-brand'
r = requests.get(url)

# stop here with an HTTPError if the response status is not OK (200)
r.raise_for_status()

# wrap the response text in StringIO and pass it to read_html
# to get the tables on the page as a list of dataframes
car_data_tables = pandas.read_html(io.StringIO(r.text), displayed_only=False)

# display the first table
car_data_tables[0]

Unnamed: 0,Brand,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
0,Acura,9230,12264,7037,5046,10341,12071,13076,13647,12941,13790,11891,15648
1,Alfa Romeo,1202,1557,943,672,1494,1569,1773,1576,1707,2055,1688,2349
2,Audi,13438,17396,10537,6270,13935,14634,16795,14928,16173,21090,17327,24102
3,BMW,22009,25002,15141,9171,20382,21403,24351,21565,22523,33313,27363,38074
4,Buick,11001,14242,8627,6395,14207,14919,17238,15333,16599,14907,12244,17037
5,Cadillac,9849,12752,7722,4194,9317,9786,11560,10275,11131,14475,11890,16544
6,Chevrolet,140809,178302,110418,59471,132149,138761,157488,139990,151656,175755,144370,200864
7,Chrysler,9785,12667,7493,2494,5543,5820,11175,9933,10761,11677,9592,13345
8,Dodge,28795,37279,22582,7875,17503,18378,25224,22421,24290,21246,17452,24281
9,Fiat,366,475,287,241,536,562,386,344,372,248,203,284


#### Reading HTML Data From URL That Requires Authentication

We know that we can read table elements from a website. However, when the site requires authentication, read_html() cannot supply credentials by itself and the request fails with an exception.

To read data from such URLs we will use the requests module. You can install it with pip:

    !pip install requests

In [10]:
!pip install requests









In [11]:
import requests

r = requests.get('https://httpbin.org/basic-auth/john/johnspassword', auth=('john', 'johnspassword'))

print(r.status_code)
print(r.text)

200
{
  "authenticated": true, 
  "user": "john"
}



**This shows that we successfully accessed the web page content of an authenticated URL. However, this website only contains JSON data and we need HTML table elements as DataFrames.**

**Let's stick to the earlier URL and use requests to read HTML tables as DataFrames. While the previous site was public, the steps to access authenticated content are the same.**

**Once we get a response, we can pass the r.text to read_html() method. And as usual, we'll get a list of tables it contains as DataFrames:**

In [12]:
import io
import pandas as pd
import requests

# Can use auth parameter for authenticated URLs
r = requests.get('https://en.wikipedia.org/wiki/Python_(programming_language)',
                 auth=('john', 'johnspassword'))
# wrap the HTML text in StringIO before handing it to read_html
tables = pd.read_html(io.StringIO(r.text))
print('Tables found:', len(tables))
df1 = tables[0]
print('First Table')
print(df1.head())

Tables found: 13
First Table
             0                                                  1
0          NaN                                                NaN
1          NaN                                                NaN
2     Paradigm  Multi-paradigm: object-oriented,[1] procedural...
3  Designed by                                   Guido van Rossum
4    Developer                         Python Software Foundation


### From other dataframes

#### Method 11:

As a copy of another dataframe.

In [None]:
df_copy = df.copy()   # copy into a new dataframe object
df_copy = df          # make an alias of the dataframe(not creating 
                      # a new dataframe, just a pointer)

In [None]:
# as a new object using .copy() method - new dataframe object created independent of old one
a = pd.DataFrame({'year': [2019],'make': ["Mercedes"],'model':["C-Class"]})
b = a.copy()
# change old one
a['year'] = 2020
# new copy does not reflect the change
b

In [None]:
# as variable copy - new variable is just an alias to the old one
a = pd.DataFrame({'year': [2019],'make': ["Mercedes"],'model':["C-Class"]})
b = a
# change old one
a['year'] = 2020
# alias reflects the change
b

#### Method 12:

Vertical concatenation — one on top of the other

In [None]:
data1 = [        
        {'year': 2018, 'make': "honda", 'model':"civic"}, 
        {'year': 2020, 'make': "hyndai", 'model':"nissan"}, 
        {'year': 2017, 'make': "nissan" ,'model':"sentra"}
       ]
df1 = pd.DataFrame(data1)
data2 = [{'year': 2019, 'make': "bmw", 'model':"x5"}]
df2 = pd.DataFrame(data2)
# concatenate vertically
# NOTE: axis = 'index' is same as axis = 0, and is the default
# The three statements below are equivalent
df3 = pd.concat([df1,df2], axis = 'index') 
#OR
df3 = pd.concat([df1,df2], axis = 0)
# OR
df3 = pd.concat([df1,df2])
df3

In [None]:
# drop=True discards the old index instead of keeping it as a new column
df3 = pd.concat([df1,df2]).reset_index(drop=True)
#OR
df3 = pd.concat([df1,df2], ignore_index = True)
df3

#### Method 13:

Horizontal concatenation — append side by side, not joined by any key

In [None]:
data1 = [{'year': 2014, 'make': "toyota", 'model':"corolla"}, 
        {'year': 2018, 'make': "honda", 'model':"civic"}, 
        {'year': 2020, 'make': "hyndai", 'model':"nissan"}, 
        {'year': 2017, 'make': "nissan" ,'model':"sentra"}
       ]
df1 = pd.DataFrame(data1)
data2 = [{'year': 2019, 'make': "bmw", 'model':"x5"}]
df2 = pd.DataFrame(data2)
df3 = pd.concat([df1,df2], axis = 'columns')
#OR
df3 = pd.concat([df1,df2], axis = 1)
df3

#### Method 14:

Horizontal concatenation — equivalent of SQL join.
Inner join

In [None]:
data1 = [{'year': 2014, 'make': "toyota", 'model':"corolla"}, 
        {'year': 2018, 'make': "honda", 'model':"civic"}, 
        {'year': 2020, 'make': "hyndai", 'model':"nissan"}, 
        {'year': 2017, 'make': "nissan" ,'model':"sentra"}
       ]
df1 = pd.DataFrame(data1)
data2 = [{'make': 'honda', 'Monthly Sales': 114117}, 
        {'make': 'toyota', 'Monthly Sales': 172370}, 
        {'make': 'hyndai', 'Monthly Sales': 54790}
       ]
df2 = pd.DataFrame(data2)
# inner join on 'make'
# default is inner join
df3 = pd.merge(df1,df2,how = 'inner',on = ['make'])
df3 = pd.merge(df1,df2,on = ['make'])
df3

In [None]:
# Left join

# for a left join , use how = 'left'
df3 = pd.merge(df1,df2,how = 'left',on = ['make'])
df3

#### Method 15:

As a transpose of another dataframe:

In [None]:
# To transpose a dataframe - use the .T attribute
df4 = df3.T
# To rename columns to anything else after the transpose
df4.columns = ['column1','column2','column3','column4']
df4

In [44]:
import pandas as pd

df = pd.DataFrame(columns=['N1','N2'])
df.loc[0] = [1,2]
df.loc[1] = [3,4]
df.loc[2] = [5,6]
df

Unnamed: 0,N1,N2
0,1,2
1,3,4
2,5,6


In [50]:
df1 = df.transpose()
df1

Unnamed: 0,0,1,2
N1,1,3,5
N2,2,4,6


In [51]:
df1 = df.transpose()
df1.columns = ['N1','N2','N3']
df1

Unnamed: 0,N1,N2,N3
N1,1,3,5
N2,2,4,6


#### Method 16:

Conversion to one-hot columns (used for modeling with learning algorithms) using pandas get_dummies function.

One-hot encoding converts a column's value into a set of derived binary columns: for each row, exactly one column of the set is 1 and the rest are 0.

If we know that a car has body types = SEDAN, SUV, VAN, TRUCK, then a Toyota Corolla with body = 'SEDAN' will be one-hot encoded to

    body_SEDAN   body_SUV   body_VAN   body_TRUCK
    1            0          0          0


Each one-hot column name has the format

    <original_column_name>_<possible_value>

#### Below is an example:

In [55]:
data1 = [{ 'make': "toyota", 'model':"corolla", 'body':"sedan"}, 
        {'make': "honda", 'model':"crv", 'body':"suv"}, 
        {'make': "dodge", 'model':"caravan", 'body':"van"}, 
        {'make': "ford" ,'model':"f150", 'body':"truck"}
       ]
df1 = pd.DataFrame(data1) 

df2 = pd.get_dummies(df1)
df2

Unnamed: 0,make_dodge,make_ford,make_honda,make_toyota,model_caravan,model_corolla,model_crv,model_f150,body_sedan,body_suv,body_truck,body_van
0,0,0,0,1,0,1,0,0,1,0,0,0
1,0,0,1,0,0,0,1,0,0,1,0,0
2,1,0,0,0,1,0,0,0,0,0,0,1
3,0,1,0,0,0,0,0,1,0,0,1,0
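In the example above every text column was encoded, including `make` and `model`. When only one column should be one-hot encoded, the `columns` parameter of `get_dummies` can restrict the conversion. The sketch below uses a shortened, two-row version of the data (the names `cars` and `encoded` are assumptions for this example).

```python
import pandas as pd

cars = pd.DataFrame([
    {'make': "toyota", 'model': "corolla", 'body': "sedan"},
    {'make': "honda", 'model': "crv", 'body': "suv"},
])

# Encode only the 'body' column; 'make' and 'model' are left untouched.
# dtype=int requests 0/1 integers (recent pandas versions default to booleans).
encoded = pd.get_dummies(cars, columns=['body'], dtype=int)
print(encoded)
```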


#### Method 17: Reading from Databases

#### 1. from MySQL
#### 2. from SQL Server
#### 3. from SQLite

### 1. Reading from MySQL Database

## Python MySQL Server Integration

1. Using mysql.connector
2. Using pymysql

### Step 1 : Install MySQL Driver

Python needs a MySQL driver to access the MySQL database.

Here, I will use the driver "MySQL Connector".

I recommend that you use PIP to install "MySQL Connector".

PIP is most likely already installed in your Python environment.

Navigate your command line to the location of PIP, and type the following to download and install "MySQL Connector":

    C:\Users\Your Name\AppData\Local\Programs\Python\Python36-32\Scripts>python -m pip install mysql-connector-python

Now you have downloaded and installed a MySQL driver.

### Step 2: Test MySQL Connector

To test if the installation was successful, or if you already have "MySQL Connector" installed, create a Python file with the following content:

    import mysql.connector

### Step 3: Create Connection

Start by creating a connection to the database.

Use the username and password from your MySQL database:

    import mysql.connector

    mydb = mysql.connector.connect(
      host="localhost",
      user="yourusername",
      password="yourpassword"
    )

    print(mydb)

### Installing mysql connector

In [17]:
!pip install mysql-connector-python






### Creating a Database

To create a database in MySQL, use the "CREATE DATABASE" statement:

#### Example - Create a database named "my_db":

In [62]:
import mysql.connector

connection = mysql.connector.connect(host="localhost",
                                    user="root",
                                    password="")

my_cursor = connection.cursor()

my_cursor.execute("CREATE DATABASE my_db")
connection.close()

In [1]:
!pip install pymysql






In [5]:
import pymysql

con_db = pymysql.connect(host="127.0.0.1",
                         user="root",
                         password="",
                         database="mydb")

my_cursor = con_db.cursor()

query ="""
INSERT INTO employee VALUES
(1001,'amit jain','delhi'),
(1004,'anuj kapoor','noida'),
(1003,'amita jain','gurgaon');
"""

my_cursor.execute(query)
con_db.commit()
my_cursor.execute("SELECT*FROM employee;")
for rows in my_cursor:
    print(rows)
con_db.close()    

(1001, 'amit jain', 'delhi')
(1003, 'amita jain', 'gurgaon')
(1004, 'anuj kapoor', 'noida')


### Check if Database Exists

You can check if a database exists by listing all databases in your system with the "SHOW DATABASES" statement:

#### Example - Return a list of your system's databases:

In [64]:
import mysql.connector

connection = mysql.connector.connect(host="localhost",
                                    user="root",
                                    password="")

my_cursor = connection.cursor()

my_cursor.execute("SHOW DATABASES;")

for dbs in my_cursor:
    print("Database list: ",dbs)
    
connection.close()

Database list:  ('cetpa_db',)
Database list:  ('college',)
Database list:  ('information_schema',)
Database list:  ('library',)
Database list:  ('my_db',)
Database list:  ('mydatabase',)
Database list:  ('mydatabase_db',)
Database list:  ('mysql',)
Database list:  ('new_db',)
Database list:  ('performance_schema',)
Database list:  ('phpmyadmin',)
Database list:  ('restaurent',)
Database list:  ('test',)


### Creating a Table

To create a table in MySQL, use the "CREATE TABLE" statement.

Make sure you define the name of the database when you create the connection

#### Example - Create a table named "marks":

In [65]:
import mysql.connector

connection = mysql.connector.connect(host="localhost",
                                    user="root",
                                    password="",
                                    database="my_db")

my_cursor = connection.cursor()

query = """
CREATE TABLE marks(
stud_id INT(20),
stud_name VARCHAR(50),
stud_marks INT(20)
)
"""
my_cursor.execute(query)
connection.close()

### Check if Table Exists

You can check if a table exists by listing all tables in your database with the "SHOW TABLES" statement:

#### Example - Return a list of the tables in your database:

In [66]:
import mysql.connector

connection = mysql.connector.connect(host="localhost",
                                    user="root",
                                    password="",
                                    database="my_db")

my_cursor = connection.cursor()

my_cursor.execute("SHOW TABLES;")
for tables in my_cursor:
    print("Table list: ", tables)
connection.close()

Table list:  ('marks',)




In [6]:
import mysql.connector

connection = mysql.connector.connect(host="localhost",
                                    user="root",
                                    password="yourpassword",
                                    database="my_db")

my_cursor = connection.cursor()

my_cursor.execute("CREATE DATABASE sample_db;")

# CREATE DATABASE returns no rows, so list the databases to confirm
my_cursor.execute("SHOW DATABASES;")
for dbs in my_cursor:
    print("Database list: ", dbs)
connection.close()

### Insert Into Table
To fill a table in MySQL, use the "INSERT INTO" statement.

#### Example: Insert a record in the "customers" table:

### Insert Multiple Rows

To insert multiple rows into a table, use the executemany() method.

The second parameter of the executemany() method is a list of tuples, containing the data you want to insert:

#### Example: Fill the "customers" table with data:
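As a hedged sketch of both insert patterns, the block below uses Python's built-in `sqlite3` module as a stand-in so that it runs without a MySQL server. This is a substitution, not the MySQL code itself: the `customers` columns (`name`, `address`) and the sample rows are assumptions, and SQLite uses `?` placeholders where `mysql.connector` and `pymysql` use `%s`. The `execute()` / `executemany()` / `commit()` sequence is the same.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute("CREATE TABLE customers (name TEXT, address TEXT)")

# Single row: values are passed separately from the SQL text
cur.execute("INSERT INTO customers (name, address) VALUES (?, ?)",
            ('John', 'Highway 21'))

# Multiple rows: executemany() takes a list of tuples as its second argument
rows = [('Peter', 'Lowstreet 4'), ('Amy', 'Apple st 652')]
cur.executemany("INSERT INTO customers (name, address) VALUES (?, ?)", rows)

# Make the inserts permanent
conn.commit()

cur.execute("SELECT COUNT(*) FROM customers")
row_count = cur.fetchone()[0]
print(row_count)
conn.close()
```

Passing values as parameters instead of building the SQL string by hand also lets the driver handle quoting.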


### Select From a Table

To select from a table in MySQL, use the "SELECT" statement:

#### Example: Select all records from the "customers" table, and display the result:

In [None]:
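import mysql.connector

connection = mysql.connector.connect(host="localhost", user="root",
                                     password="", database="my_db")
my_cursor = connection.cursor()
my_cursor.execute("SELECT * FROM customers")
print(my_cursor.fetchall())
connection.close()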

### Creating DataFrame

### 2. Reading from SQL Server

### Python SQL Server Integration using Pyodbc: 4 Step Approach

#### Step 1: Establish the Python SQL Server Connection

The first step of setting up the Python SQL Server Integration requires you to build a connection between Python and the SQL server using the pyodbc.connect function and pass a connection string. The Python MsSQL Connection string will define the DBMS Driver, connection settings, the Server, and a specific Database.

Now, for instance, you wish to connect to server USXXX00345,67800 and a database DB02 using the SQL Server Native Client 11.0.

##### There are 2 ways to establish this Python SQL Server connection:

***Approach 1 to Setup Python SQL Server Connection: You can depend on a trusted internal connection using the following code:***

    import pyodbc

    cnxn_str = ("Driver={SQL Server Native Client 11.0};"
                "Server=USXXX00345,67800;"
                "Database=DB02;"
                "Trusted_Connection=yes;")
    cnxn = pyodbc.connect(cnxn_str)

***Approach 2 to Setup Python SQL Server Connection: You don’t have a trusted internal connection and wish to set up the required SQL Server connection using SQL Server Management Studio (SSMS). This will require you to enter your username (say, Alex) and password(Alex123) as shown in the following code:***

    cnxn_str = ("Driver={SQL Server Native Client 11.0};"
                "Server=USXXX00345,67800;"
                "Database=DB02;"
                "UID=Alex;"
                "PWD=Alex123;")
    cnxn = pyodbc.connect(cnxn_str)

>***The SQL Server Native Client 11.0 ODBC Driver was released with SQL Server 2012 and can access SQL Servers from 2005 and above.***

Now, as your Python database connection is in place, you can perform SQL queries via Python.

#### Step 2: Run an SQL Query

Now, every query that you will perform on the SQL Server will involve a cursor initialization and query execution sequence. Moreover, any changes made inside the SQL Server must also reflect in Python (which is covered in Step3 of Python MS SQL Server Integration).

You can initialize a cursor via:

    cursor = cnxn.cursor()

Now, if you wish to perform a query, call this cursor object. For example, the following query will select the top 100 rows from a SQL table name associates:

    cursor.execute("SELECT TOP(100) * FROM associates")

This query will give you the desired results, however, no data will be returned to Python. To ensure that your SQL changes are reflected in Python, move on to the next step of Python SQL Server Integration.

#### Step 3: Extract Query Results to Python

To extract your data from SQL Server into Python, you will need the Pandas library. Pandas contains the read_sql function, which reads the result of an SQL query into a DataFrame. read_sql requires a query and the connection instance cnxn to extract the given data, as follows:

    data = pd.read_sql("SELECT TOP(100) * FROM associates", cnxn)

This will return a data frame consisting of the top 100 rows from your associates table.

#### Step 4: Apply Modifications in SQL Server

Next, if you wish to change the SQL data, you must add another step to the query execution process. This is because when you execute SQL queries, the changes are stored in a temporary space instead of directly modifying your stored data.

To make such modifications permanent, you have to commit them. For instance, if you wish to merge the firstName and lastName columns, generate a fullName column using the below code:

    cursor = cnxn.cursor()

    # first alter the table, adding a column
    cursor.execute("ALTER TABLE associates " +
                   "ADD fullName VARCHAR(20)")

    # now update that column to contain firstName + lastName
    cursor.execute("UPDATE associates " +
                   "SET fullName = firstName + ' ' + lastName")

Even after executing this code, you won't find any fullName column in your associates table. You need to commit the above changes and make them permanent via the following command:

    cnxn.commit()

In [None]:
import pyodbc

cnxn_str = ("Driver={SQL Server Native Client 11.0};"
            "Server=USXXX00345,67800;" "Database=DB02;"
            "UID=Alex;" "PWD=Alex123;")

# pass the connection string defined above
cnxn = pyodbc.connect(cnxn_str)



### 3. Reading from SQLite

### Python SQLite Integration

    sqlite3.connect('test_database')

### Creating a table in SQLite

In [8]:
import sqlite3

In [9]:
connection = sqlite3.connect('test_database')
my_cursor = connection.cursor()
my_cursor

### Creating data in table
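No code accompanies this step, so here is a minimal end-to-end sketch under stated assumptions: the `products` schema (`product_id`, `product_name`, `price`) and its two rows are hypothetical, and an in-memory database is used so that nothing is written to disk. It creates the table, inserts data, and loads the result into a DataFrame with `pd.read_sql_query`.

```python
import sqlite3
import pandas as pd

# In-memory database so the sketch leaves no file behind;
# sqlite3.connect('test_database') would work the same way
conn = sqlite3.connect(':memory:')
cur = conn.cursor()

# Hypothetical schema and rows for a 'products' table
cur.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, "
            "product_name TEXT, price REAL)")
cur.executemany("INSERT INTO products VALUES (?, ?, ?)",
                [(1, 'Computer', 800.0), (2, 'Printer', 200.0)])
conn.commit()

# Load the query result into a DataFrame
products_df = pd.read_sql_query("SELECT * FROM products", conn)
print(products_df)
conn.close()
```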

### Creating a DataFrame in Pandas

In [None]:
pd.read_sql_query("SELECT * FROM products", connection)