Skip to content

Commit 3773bea

Browse files
committed
Updated the Readme file
1 parent 4f4532e commit 3773bea

1 file changed

Lines changed: 65 additions & 0 deletions

File tree

README.md

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,66 @@
11
# SQL Practice Questions
2+
3+
This repository consists of a sample IMDB database along with some sql questions and their solutions. Solving these
4+
help as a quick sql refresher since these questions cover all the commonly used scenarios.
5+
6+
Sql is the most important and underrated skill for any data scientist and in the feild of machine learning as we
7+
spend 75% of the time in cleaning and analyzing data.
8+
9+
One of the challenges in the given DB is that the sata is not completely cleaned, please make sure to analyze before
10+
writing your queries.
11+
12+
Please refer to the solutions if you are stuck anyhwere, i have tried to provide explanations and the thought process
13+
behind each solution, hopefully that helps and even better if you can come up with a more effecient way of solving.
14+
15+
Please create a pull request if you find a amore effecient way or if there are any corrections needed in the solutions.
16+
17+
### **List of files :**
18+
19+
+ **Database Scema Diagram** - Provides a schematic of all the tables in the database and their realtionships.
20+
+ **Db-IMDB.db** - Sample IMDB database that we would be using.
21+
+ **Questions.pdf** - Provides the list of questions to be solved.
22+
+ **Solutions-Jupyter notebook.ipynb** -- Ipython notebbok with all the solutions.
23+
+ **Solutions-PDF.pdf** - PDF version of the ipython notebook.
24+
+ **Solution.sql** - List of all the solutions saved as a sql file.
25+
26+
We would be using python pandas library in a ipython notebook to coonect to the given database and run our sql queries.
27+
The installation process and how to run queries using pandas can be found below.
28+
29+
30+
## Install :
31+
32+
#### **Required Softwares :**
33+
34+
+ **Python 3 :** - Please install the latest version of python 3 from [here](https://www.python.org/downloads/) . At
35+
the end of the installation don't forget to click add python to path.
36+
+ **Anaconda** - Anaconda is an open source distribution of python, it consists of all the frequently used python packages
37+
that we need. Install it from [here](https://www.anaconda.com/distribution/) . Please choose the python 3 version.
38+
39+
That's it , you should have all the softwares you need to run.
40+
41+
If you have never used a jupyter notebook, don't worry it's pretty straight forward, you can find a quick overview
42+
[here](https://www.youtube.com/watch?v=HW29067qVWk).
43+
44+
## Steps to connect to the database using pandas :
45+
46+
+ Create a new hupyter notebook, preferably in the same folder where you put the Db-IMDB.db file.
47+
+ We need to import couple of libraries pandas and sqllite3.
48+
```
49+
import pandas as pd
50+
import sqlite3 as sql # included as part of python standard library
51+
```
52+
+ Make a coonection to the sample imdb database.
53+
```
54+
conn = sql.connect("Db-IMDB.db")
55+
```
56+
+ Once we have the connection we can use pandas to write sql queries and see the results.The below query gives all the tables in
57+
the database
58+
```
59+
result = pd.read_sql_query("SELECT * FROM sqlite_master where type = 'table';" )
60+
print(result)
61+
```
62+
63+
## Contributions :
64+
65+
The sample IMDB database and the questions are provided by the [Applied AI Team](https://www.appliedaicourse.com/) as part of their
66+
machine learning course.

0 commit comments

Comments
 (0)