# Ex 5.3 Retrieving Data from a Remote *AdventureWorks* SQL Server Database

- [In Class](#In-Class)  
  - Display Information about the Tables in the  **AdventureWorks** SQL Server Database  
    - Display all the Table names  
    - Display the number of different tables in each of the Schemas  
    - Display all the Tables in the *Demo* Schema
    
  
  


- [Ex 5.3](#Ex-5.3)  
  - Clean up the df_Tables Dataframe and display  
  - Display all the Views in the SQL Server Database  
  - Part 1 - 7
  


In [2]:
import pandas as pd
import pyodbc  

# In Class  
- Connect to the **AdventureWorks** database and retrieve the table names  
- Display the different Schemas  
- Display the number of different tables in each of the schemas  


**Credentials**  
- Server:   
- Database:   
- User ID:   
- Password:   


- Note: These credentials only work when the notebook is run on vcat.  
  - The DBA set it up this way for security reasons.  

### Build Connection Object

In [3]:
# Set connection values to variables
driver = "DRIVER={SQL Server Native Client 11.0};" 
server = "Server=;" 
database = "Database=;" 
userid =   "uid=;" 
password = "pwd=;"
      
# Build connection string  
conn_string = driver + server + database + userid + password

# Create Connection object
conn = pyodbc.connect(conn_string)
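
The connection string above is built by concatenating `key=value;` fragments. As a minimal sketch of the same idea, the fragments could be assembled by a small helper. The function name `build_conn_string` and every value passed to it below are hypothetical placeholders, not the real credentials, which are intentionally left blank in this notebook.

```python
def build_conn_string(driver, server, database, uid, pwd):
    """Join ODBC key=value pairs into one semicolon-terminated string."""
    parts = {
        "DRIVER": "{" + driver + "}",
        "Server": server,
        "Database": database,
        "uid": uid,
        "pwd": pwd,
    }
    return "".join(f"{key}={value};" for key, value in parts.items())


# Placeholder values only (assumptions); substitute the real credentials
conn_string = build_conn_string(
    "SQL Server Native Client 11.0",
    "my_server",
    "my_database",
    "my_user",
    "my_password",
)
print(conn_string)
# The resulting string would then be passed to pyodbc.connect(conn_string)
```

Keeping the pieces in one place makes it easier to swap a single value without retyping the whole string.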

# Display Information about the Tables in the  *AdventureWorks* SQL Server Database  

### Display All the Table Names

In [4]:
# Build the SQL String to retrieve database table info
sql_string = 'select * from INFORMATION_SCHEMA.TABLES'

# Execute the SQL query and fill a dataframe with the retrieved data
df_Tables = pd.read_sql_query(sql_string, conn) 
df_Tables.head()


Unnamed: 0,TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_TYPE
0,AdventureWorks,Sales,SalesTaxRate,BASE TABLE
1,AdventureWorks,Sales,PersonCreditCard,BASE TABLE
2,AdventureWorks,Person,PersonPhone,BASE TABLE
3,AdventureWorks,Sales,SalesTerritory,BASE TABLE
4,AdventureWorks,Sales,SpecialOfferProduct_inmem,BASE TABLE


### Display the number of different tables in each of the schemas

In [5]:
sum_tables_in_schemas = df_Tables.groupby('TABLE_SCHEMA')['TABLE_NAME'].count()
print("The number of database tables in each of the database schemas is: ", sum_tables_in_schemas)

The number of database tables in each of the database schemas is:  TABLE_SCHEMA
Demo               2
HumanResources    15
Person            18
Production        30
Purchasing         7
Sales             38
dbo                3
Name: TABLE_NAME, dtype: int64
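
The per-schema count above comes from `groupby(...).count()`. The sketch below reproduces that pattern on a small toy frame and compares it with `value_counts()`, which gives the same numbers here. The frame `toy_tables` is an illustrative stand-in, not the real catalog.

```python
import pandas as pd

# Toy stand-in for df_Tables (rows are illustrative only)
toy_tables = pd.DataFrame({
    "TABLE_SCHEMA": ["Sales", "Sales", "Person", "Demo", "Demo"],
    "TABLE_NAME": [
        "SalesTaxRate",
        "SalesTerritory",
        "PersonPhone",
        "DemoSalesOrderDetailSeed",
        "DemoSalesOrderHeaderSeed",
    ],
})

# Count table names per schema; groupby sorts the schema names by default
by_groupby = toy_tables.groupby("TABLE_SCHEMA")["TABLE_NAME"].count()

# Same counts from value_counts, re-sorted by schema name for comparison
by_value_counts = toy_tables["TABLE_SCHEMA"].value_counts().sort_index()

print(by_groupby.to_dict())
print(by_value_counts.to_dict())
```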


### Display all the Tables in the *Demo* Schema

In [6]:
# Display the Demo schema tables
# Use a boolean mask (named so it does not shadow the built-in filter function)
is_demo = df_Tables['TABLE_SCHEMA'] == 'Demo'
df_Tables[is_demo]

Unnamed: 0,TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,TABLE_TYPE
87,AdventureWorks,Demo,DemoSalesOrderDetailSeed,BASE TABLE
90,AdventureWorks,Demo,DemoSalesOrderHeaderSeed,BASE TABLE
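
The cell above filters in pandas after every row has been downloaded. An alternative is to filter on the server with a parameterized `where` clause. The sketch below shows the pattern against an in-memory SQLite database used purely as a stand-in, since the remote server is not reachable from everywhere. The table `table_info` and its rows are hypothetical. pyodbc uses the same `?` placeholder style.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database as a stand-in for the remote server (assumption)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("create table table_info (TABLE_SCHEMA text, TABLE_NAME text)")
demo_conn.executemany(
    "insert into table_info values (?, ?)",
    [
        ("Sales", "SalesTaxRate"),
        ("Demo", "DemoSalesOrderDetailSeed"),
        ("Demo", "DemoSalesOrderHeaderSeed"),
    ],
)

# The schema name is passed as a parameter instead of being pasted into the SQL
sql_string = (
    "select TABLE_NAME from table_info "
    "where TABLE_SCHEMA = ? order by TABLE_NAME"
)
df_demo = pd.read_sql_query(sql_string, demo_conn, params=("Demo",))
print(df_demo)

demo_conn.close()
```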


### Retrieve and Display the data in the *Demo.DemoSalesOrderDetailSeed* table

In [7]:
# Build the SQL string to retrieve the rows of the Demo.DemoSalesOrderDetailSeed table
sql_string = "select * from Demo.DemoSalesOrderDetailSeed" 
# Execute the query and display the first five rows of the dataframe that holds the retrieved data
df_OrderDetailSeed = pd.read_sql_query(sql_string, conn)
df_OrderDetailSeed.head()

Unnamed: 0,OrderQty,ProductID,SpecialOfferID,OrderID,LocalID
0,2,680,1,1,1
1,3,706,1,2,2
2,4,707,1,3,3
3,5,708,1,4,4
4,6,709,1,5,5


# Ex 5.3  


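# Clean up the *df_Tables* Dataframe and Display  
- The rows displayed so far are all Base Tables, so drop the TABLE_TYPE column (INFORMATION_SCHEMA.TABLES also lists views, so dropping it discards the table/view distinction) 
- Since every row belongs to AdventureWorks, drop the TABLE_CATALOG column
- Sort by table schema and table name  
- Display the different table schema names  
- Display the Demo schema tables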
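
The steps listed above can be sketched as one chained expression on a toy frame. The frame `toy_tables` and its rows are illustrative assumptions. Passing a list to `sort_values` sorts by schema first and by table name within each schema.

```python
import pandas as pd

# Toy stand-in for df_Tables (rows are illustrative only)
toy_tables = pd.DataFrame({
    "TABLE_CATALOG": ["AdventureWorks"] * 4,
    "TABLE_SCHEMA": ["Sales", "Demo", "Person", "Demo"],
    "TABLE_NAME": [
        "SalesTaxRate",
        "DemoSalesOrderHeaderSeed",
        "PersonPhone",
        "DemoSalesOrderDetailSeed",
    ],
    "TABLE_TYPE": ["BASE TABLE"] * 4,
})

# Drop the constant columns, then sort by schema and by name within each schema
cleaned = (
    toy_tables
    .drop(columns=["TABLE_TYPE", "TABLE_CATALOG"])
    .sort_values(["TABLE_SCHEMA", "TABLE_NAME"])
    .reset_index(drop=True)
)

# Distinct schema names, in sorted order because the frame is already sorted
schema_names = cleaned["TABLE_SCHEMA"].unique().tolist()

# Rows that belong to the Demo schema
demo_tables = cleaned[cleaned["TABLE_SCHEMA"] == "Demo"]

print(schema_names)
print(demo_tables)
```

Returning a new frame instead of using `inplace=True` leaves the original untouched, which makes it safe to re-run a cell.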

In [8]:
# Drop the TABLE_TYPE column
df_Tables.drop('TABLE_TYPE', axis='columns', inplace=True)

df_Tables.head(2)

Unnamed: 0,TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME
0,AdventureWorks,Sales,SalesTaxRate
1,AdventureWorks,Sales,PersonCreditCard


In [9]:
# Drop the TABLE_CATALOG column
df_Tables.drop('TABLE_CATALOG', axis='columns', inplace=True)
df_Tables.head(2)

Unnamed: 0,TABLE_SCHEMA,TABLE_NAME
0,Sales,SalesTaxRate
1,Sales,PersonCreditCard


In [10]:
# Sort by table name in descending order
df_Tables.sort_values('TABLE_NAME', inplace=True, ascending=False)

# Display the sorted df_Tables dataframe
df_Tables.head()

Unnamed: 0,TABLE_SCHEMA,TABLE_NAME
77,Purchasing,vVendorWithContacts
79,Purchasing,vVendorWithAddresses
72,Sales,vStoreWithDemographics
74,Sales,vStoreWithContacts
76,Sales,vStoreWithAddresses


# Display all the *Views* in the  *AdventureWorks* SQL Server Database  

In [14]:
# Build sql string to retrieve a list of the Views
sql_string = 'select * from INFORMATION_SCHEMA.VIEWS'

# Execute SQL query and retrieve data into a dataframe
df_Views = pd.read_sql_query(sql_string, conn)
df_Views.head()

Unnamed: 0,TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,VIEW_DEFINITION,CHECK_OPTION,IS_UPDATABLE
0,AdventureWorks,Person,vAdditionalContactInfo,\r\nCREATE VIEW [Person].[vAdditionalContactIn...,NONE,NO
1,AdventureWorks,HumanResources,vEmployee,\r\nCREATE VIEW [HumanResources].[vEmployee] \...,NONE,NO
2,AdventureWorks,HumanResources,vEmployeeDepartment,\r\nCREATE VIEW [HumanResources].[vEmployeeDep...,NONE,NO
3,AdventureWorks,HumanResources,vEmployeeDepartmentHistory,\r\nCREATE VIEW [HumanResources].[vEmployeeDep...,NONE,NO
4,AdventureWorks,Sales,vIndividualCustomer,\r\nCREATE VIEW [Sales].[vIndividualCustomer] ...,NONE,NO
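
In SQL Server, `INFORMATION_SCHEMA.TABLES` reports views as well as base tables and marks them with a `TABLE_TYPE` of `VIEW`. As a minimal sketch, the views could therefore also be picked out of a catalog frame that still has its `TABLE_TYPE` column. The frame `toy_catalog` below is an illustrative stand-in.

```python
import pandas as pd

# Toy stand-in for a catalog frame that still has its TABLE_TYPE column
toy_catalog = pd.DataFrame({
    "TABLE_SCHEMA": ["Sales", "Sales", "Person"],
    "TABLE_NAME": ["SalesTaxRate", "vIndividualCustomer", "vAdditionalContactInfo"],
    "TABLE_TYPE": ["BASE TABLE", "VIEW", "VIEW"],
})

# Keep only the rows that describe views
views_only = toy_catalog[toy_catalog["TABLE_TYPE"] == "VIEW"]

# Build schema-qualified names that can be used in a select statement
view_names = (views_only["TABLE_SCHEMA"] + "." + views_only["TABLE_NAME"]).tolist()
print(view_names)
```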


# Part 1 - 7  
Retrieve and display the top five rows (and Number of Rows and Columns) of the following Views:  

# 1. Sales.vIndividualCustomer  


In [19]:
sql_string = "select * from Sales.vIndividualCustomer"

df_Customer = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Customer.shape[0])
print("Number of columns: ", df_Customer.shape[1])

df_Customer.head()

Number of rows:  18508
Number of columns:  18


Unnamed: 0,BusinessEntityID,Title,FirstName,MiddleName,LastName,Suffix,PhoneNumber,PhoneNumberType,EmailAddress,EmailPromotion,AddressType,AddressLine1,AddressLine2,City,StateProvinceName,PostalCode,CountryRegionName,Demographics
0,1699,Mr.,David,R.,Robinett,,238-555-0100,Home,david22@adventure-works.com,1,Home,Pappelallee 6667,,Solingen,Nordrhein-Westfalen,42651,Germany,"<IndividualSurvey xmlns=""http://schemas.micros..."
1,1700,Ms.,Rebecca,A.,Robinson,,648-555-0100,Cell,rebecca3@adventure-works.com,0,Home,1861 Chinquapin Ct,,Seaford,Victoria,3198,Australia,"<IndividualSurvey xmlns=""http://schemas.micros..."
2,1701,Ms.,Dorothy,B.,Robinson,,423-555-0100,Cell,dorothy3@adventure-works.com,2,Home,4693 Mills Dr.,,Geelong,Victoria,3220,Australia,"<IndividualSurvey xmlns=""http://schemas.micros..."
3,1702,Ms.,Carol Ann,F.,Rockne,,439-555-0100,Cell,carolann0@adventure-works.com,0,Home,1312 Skycrest Drive,,Lancaster,England,LA1 1LN,United Kingdom,"<IndividualSurvey xmlns=""http://schemas.micros..."
4,1703,Mr.,Scott,M.,Rodgers,,989-555-0100,Cell,scott10@adventure-works.com,0,Home,9860 Brookview Drive,,East Brisbane,Queensland,4169,Australia,"<IndividualSurvey xmlns=""http://schemas.micros..."


# 2. Sales.vPersonDemographics  


In [20]:
sql_string = "select * from Sales.vPersonDemographics"

df_Person_Demographics = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Person_Demographics.shape[0])
print("Number of columns: ", df_Person_Demographics.shape[1])

df_Person_Demographics.head()

Number of rows:  19972
Number of columns:  13


Unnamed: 0,BusinessEntityID,TotalPurchaseYTD,DateFirstPurchase,BirthDate,MaritalStatus,YearlyIncome,Gender,TotalChildren,NumberChildrenAtHome,Education,Occupation,HomeOwnerFlag,NumberCarsOwned
0,1,0.0,NaT,NaT,,,,,,,,,
1,2,0.0,NaT,NaT,,,,,,,,,
2,3,0.0,NaT,NaT,,,,,,,,,
3,4,0.0,NaT,NaT,,,,,,,,,
4,5,0.0,NaT,NaT,,,,,,,,,


# 3. Sales.vSalesPerson  


In [21]:
sql_string = "select * from Sales.vSalesPerson"

df_Person = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Person.shape[0])
print("Number of columns: ", df_Person.shape[1])

df_Person.head()

Number of rows:  17
Number of columns:  22


Unnamed: 0,BusinessEntityID,Title,FirstName,MiddleName,LastName,Suffix,JobTitle,PhoneNumber,PhoneNumberType,EmailAddress,...,AddressLine2,City,StateProvinceName,PostalCode,CountryRegionName,TerritoryName,TerritoryGroup,SalesQuota,SalesYTD,SalesLastYear
0,274,,Stephen,Y,Jiang,,North American Sales Manager,238-555-0197,Cell,stephen0@adventure-works.com,...,,Redmond,Washington,98052,United States,,,,559697.6,0.0
1,275,,Michael,G,Blythe,,Sales Representative,257-555-0154,Cell,michael9@adventure-works.com,...,,Detroit,Michigan,48226,United States,Northeast,North America,300000.0,3763178.0,1750406.0
2,276,,Linda,C,Mitchell,,Sales Representative,883-555-0116,Work,linda3@adventure-works.com,...,,Nevada,Utah,84407,United States,Southwest,North America,250000.0,4251369.0,1439156.0
3,277,,Jillian,,Carson,,Sales Representative,517-555-0117,Work,jillian0@adventure-works.com,...,,Duluth,Minnesota,55802,United States,Central,North America,250000.0,3189418.0,1997186.0
4,278,,Garrett,R,Vargas,,Sales Representative,922-555-0165,Work,garrett1@adventure-works.com,...,,Calgary,Alberta,T2P 2G8,Canada,Canada,North America,250000.0,1453719.0,1620277.0


# 4. Sales.vSalesPersonSalesByFiscalYears  


In [22]:
sql_string = "select * from Sales.vSalesPersonSalesByFiscalYears"

df_Years = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Years.shape[0])
print("Number of columns: ", df_Years.shape[1])

df_Years.head()


# 5. Sales.vStoreWithAddresses  


In [25]:
sql_string = "select * from Sales.vStoreWithAddresses"

df_Addresses = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Addresses.shape[0])
print("Number of columns: ", df_Addresses.shape[1])

df_Addresses.head()

Number of rows:  712
Number of columns:  9


Unnamed: 0,BusinessEntityID,Name,AddressType,AddressLine1,AddressLine2,City,StateProvinceName,PostalCode,CountryRegionName
0,956,Finer Sales and Service,Main Office,#500-75 O'Connor Street,,Ottawa,Ontario,K4B 1S2,Canada
1,780,Finer Riding Supplies,Main Office,#9900 2700 Production Way,,Burnaby,British Columbia,V5A 4X1,Canada
2,1012,Stylish Department Stores,Main Office,1 Corporate Center Drive,,Miami,Florida,33127,United States
3,482,Favorite Toy Distributor,Main Office,"1, place de la République",,Paris,Seine (Paris),75017,France
4,1338,Sports Sales and Rental,Main Office,100 Fifth Drive,,Millington,Tennessee,38054,United States


# 6. Sales.vStoreWithContacts  

In [26]:
sql_string = "select * from Sales.vStoreWithContacts"

df_Contacts = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Contacts.shape[0])
print("Number of columns: ", df_Contacts.shape[1])

df_Contacts.head()

Number of rows:  753
Number of columns:  12


Unnamed: 0,BusinessEntityID,Name,ContactType,Title,FirstName,MiddleName,LastName,Suffix,PhoneNumber,PhoneNumberType,EmailAddress,EmailPromotion
0,292,Next-Door Bike Store,Owner,Mr.,Gustavo,,Achong,,398-555-0132,Cell,gustavo0@adventure-works.com,2
1,294,Professional Sales and Service,Owner,Ms.,Catherine,R.,Abel,,747-555-0171,Cell,catherine0@adventure-works.com,1
2,296,Riders Company,Owner,Ms.,Kim,,Abercrombie,,334-555-0137,Work,kim2@adventure-works.com,0
3,298,The Bike Mechanics,Owner,Sr.,Humberto,,Acevedo,,599-555-0127,Cell,humberto0@adventure-works.com,2
4,300,Nationwide Supply,Owner,Sra.,Pilar,,Ackerman,,1 (11) 500 555-0132,Cell,pilar1@adventure-works.com,0


# 7. Sales.vStoreWithDemographics

In [27]:
sql_string = "select * from Sales.vStoreWithDemographics"

df_Store_Demographics = pd.read_sql_query(sql_string, conn)

print("Number of rows: ", df_Store_Demographics.shape[0])
print("Number of columns: ", df_Store_Demographics.shape[1])

df_Store_Demographics.head()

Number of rows:  701
Number of columns:  12


Unnamed: 0,BusinessEntityID,Name,AnnualSales,AnnualRevenue,BankName,BusinessType,YearOpened,Specialty,SquareFeet,Brands,Internet,NumberEmployees
0,314,Top of the Line Bikes,1500000.0,150000.0,International Bank,OS,1986,Touring,40000,4+,DSL,46
1,316,Fun Toys and Bikes,300000.0,30000.0,Primary Bank & Reserve,BM,1973,Touring,6000,2,DSL,2
2,318,Great Bikes,300000.0,30000.0,International Security,BM,1981,Mountain,10000,4+,T3,3
3,320,Metropolitan Sales and Rental,300000.0,30000.0,Guardian Bank,BM,1976,Road,6000,AW,T1,4
4,322,Irregulars Outlet,300000.0,30000.0,Primary International,BM,1984,Mountain,7000,AW,DSL,5
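
Parts 1 to 7 repeat the same three steps: run `select *` against a view, print the shape, and show the first five rows. A minimal sketch of wrapping those steps in a helper and closing the connection when done is given below. The helper name `preview_view`, the in-memory SQLite stand-in, and the views `vStoreNames` and `vStoreCities` are all hypothetical. With the real server, the connection would come from `pyodbc.connect` and the names would be the `Sales.v...` views.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def preview_view(conn, view_name, n=5):
    """Load a view into a DataFrame, print its shape, and return the first n rows."""
    # view_name is interpolated into the SQL, so it must come from a trusted list
    df = pd.read_sql_query(f"select * from {view_name}", conn)
    print(f"{view_name}: {df.shape[0]} rows, {df.shape[1]} columns")
    return df.head(n)


# closing() makes sure the connection is closed when the block ends
with closing(sqlite3.connect(":memory:")) as demo_conn:
    # Build a tiny table and two views to query (stand-in data)
    demo_conn.execute("create table store (id integer, name text, city text)")
    demo_conn.executemany(
        "insert into store values (?, ?, ?)",
        [(1, "Great Bikes", "Miami"), (2, "Riders Company", "Paris")],
    )
    demo_conn.execute("create view vStoreNames as select id, name from store")
    demo_conn.execute("create view vStoreCities as select id, city from store")

    # One preview per view, keyed by view name
    previews = {
        name: preview_view(demo_conn, name)
        for name in ["vStoreNames", "vStoreCities"]
    }
```

A single helper also keeps the row and column labels consistent across all seven views.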
