|
1 | 1 | # SQL Practice Questions |
| 2 | + |
| 3 | +This repository consists of a sample IMDB database along with some sql questions and their solutions. Solving these |
| 4 | +help as a quick sql refresher since these questions cover all the commonly used scenarios. |
| 5 | + |
| 6 | +Sql is the most important and underrated skill for any data scientist and in the feild of machine learning as we |
| 7 | +spend 75% of the time in cleaning and analyzing data. |
| 8 | + |
| 9 | +One of the challenges in the given DB is that the sata is not completely cleaned, please make sure to analyze before |
| 10 | +writing your queries. |
| 11 | + |
| 12 | +Please refer to the solutions if you are stuck anyhwere, i have tried to provide explanations and the thought process |
| 13 | +behind each solution, hopefully that helps and even better if you can come up with a more effecient way of solving. |
| 14 | + |
| 15 | +Please create a pull request if you find a amore effecient way or if there are any corrections needed in the solutions. |
| 16 | + |
| 17 | +### **List of files :** |
| 18 | + |
| 19 | ++ **Database Scema Diagram** - Provides a schematic of all the tables in the database and their realtionships. |
| 20 | ++ **Db-IMDB.db** - Sample IMDB database that we would be using. |
| 21 | ++ **Questions.pdf** - Provides the list of questions to be solved. |
| 22 | ++ **Solutions-Jupyter notebook.ipynb** -- Ipython notebbok with all the solutions. |
| 23 | ++ **Solutions-PDF.pdf** - PDF version of the ipython notebook. |
| 24 | ++ **Solution.sql** - List of all the solutions saved as a sql file. |
| 25 | + |
| 26 | +We would be using python pandas library in a ipython notebook to coonect to the given database and run our sql queries. |
| 27 | +The installation process and how to run queries using pandas can be found below. |
| 28 | + |
| 29 | + |
| 30 | +## Install : |
| 31 | + |
| 32 | +#### **Required Softwares :** |
| 33 | + |
| 34 | ++ **Python 3 :** - Please install the latest version of python 3 from [here](https://www.python.org/downloads/) . At |
| 35 | +the end of the installation don't forget to click add python to path. |
| 36 | ++ **Anaconda** - Anaconda is an open source distribution of python, it consists of all the frequently used python packages |
| 37 | +that we need. Install it from [here](https://www.anaconda.com/distribution/) . Please choose the python 3 version. |
| 38 | + |
| 39 | +That's it , you should have all the softwares you need to run. |
| 40 | + |
| 41 | +If you have never used a jupyter notebook, don't worry it's pretty straight forward, you can find a quick overview |
| 42 | +[here](https://www.youtube.com/watch?v=HW29067qVWk). |
| 43 | + |
| 44 | +## Steps to connect to the database using pandas : |
| 45 | + |
| 46 | ++ Create a new hupyter notebook, preferably in the same folder where you put the Db-IMDB.db file. |
| 47 | ++ We need to import couple of libraries pandas and sqllite3. |
| 48 | +``` |
| 49 | + import pandas as pd |
| 50 | + import sqlite3 as sql # included as part of python standard library |
| 51 | +``` |
| 52 | ++ Make a coonection to the sample imdb database. |
| 53 | +``` |
| 54 | + conn = sql.connect("Db-IMDB.db") |
| 55 | +``` |
| 56 | ++ Once we have the connection we can use pandas to write sql queries and see the results.The below query gives all the tables in |
| 57 | +the database |
| 58 | +``` |
| 59 | + result = pd.read_sql_query("SELECT * FROM sqlite_master where type = 'table';" ) |
| 60 | + print(result) |
| 61 | +``` |
| 62 | + |
| 63 | +## Contributions : |
| 64 | + |
| 65 | +The sample IMDB database and the questions are provided by the [Applied AI Team](https://www.appliedaicourse.com/) as part of their |
| 66 | +machine learning course. |
0 commit comments