**NOTE: Distributing or uploading this course material to a public repository (e.g., GitHub) is strictly prohibited.**

## **Database Creation**

We want to implement an OpenCourseWare service using SQLite3. Continuing from the first and second mini-projects, we will add the functionality for user management.

The given 'create_database_proj3.sql' file contains the schema definition and a sample set of tuples. You can download it from [here](https://drive.google.com/file/d/1_hp8vix4Y15VXSmH3ZMp9JuSV66T7FG8/view?usp=share_link).

The database contains the following tables:
* *kmooc_user*
* *kmooc_developer*
* *kmooc_tech*
* *kmooc_developer_techs*

Please see the file for the detailed schema. Note that you need to use 'create_database_proj**3**.sql', not the files from the earlier mini-projects.

We will implement several core modules of the service.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
# Execute the SQL statements in 'create_database_proj3.sql'.
# If you placed this file in the 'MyDrive' folder, its path is '/content/drive/MyDrive/create_database_proj3.sql'.
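with open("/content/drive/MyDrive/create_database_proj3.sql", 'r') as f:
  sql_script = f.read()
# executescript() runs every statement in the file. Splitting the text on ';'
# by hand would break on any semicolon that appears inside a string literal.
cur.executescript(sql_script)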


conn.commit()
conn.close()
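
As a quick sanity check after loading a script, the created tables can be listed from SQLite's built-in `sqlite_master` catalog. The sketch below is only an illustration: it runs against an in-memory database with a made-up two-table script (`demo_script`, `demo_user`, `demo_developer`, and `list_tables` are hypothetical names, not part of the assignment), so it does not touch 'kmooc.sqlite3'.

```python
import sqlite3

# Hypothetical stand-in for the contents of a .sql file. The real schema
# lives in 'create_database_proj3.sql' and is not reproduced here.
demo_script = """
CREATE TABLE demo_user (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE demo_developer (user_id INTEGER PRIMARY KEY, level TEXT);
INSERT INTO demo_user VALUES (1, 'Ann; the first');
"""


def list_tables(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
# executescript() parses the whole string as SQL, so the semicolon inside
# the string literal 'Ann; the first' does not end the statement early.
conn.executescript(demo_script)
tables = list_tables(conn)
first_name = conn.execute("SELECT first_name FROM demo_user").fetchone()[0]
conn.close()

print(tables)
print(first_name)
```

Applied to 'kmooc.sqlite3', the same `sqlite_master` query should list the four tables named at the top of this section, assuming the script loaded without errors.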

## **Module 1**

Create a query that will join the data from the `"kmooc_user"` table with the `"kmooc_developer"` table (INNER JOIN).

The output table should contain the following columns:
* *first_name* column from the `"kmooc_user"` table
* *last_name* column from the `"kmooc_user"` table
* *level* column from the `"kmooc_developer"` table

Sort the output table in ascending order of the *first_name* column. (Throughout this notebook, the first name and the last name are kept as separate columns and are not concatenated.)

**Format**:
>```
('Daniel', 'Harris', 'junior')
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT u.first_name, u.last_name, d.level
FROM kmooc_user AS u
INNER JOIN kmooc_developer AS d
ON u.id = d.user_id
ORDER BY u.first_name ASC;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('Carol', 'Horn', 'mid')
('Daniel', 'Harris', 'junior')
('Daniel', 'Thompson', 'mid')
('James', 'Simons', 'senior')
('John', 'Taylor', 'senior')
('Michaela', 'Garrison', 'senior')
('Paula', 'Burke', 'junior')
('Sharon', 'Johnson', 'junior')
('Shelly', 'Hudson', 'senior')
('William', 'Lopez', 'mid')


## **Module 2**

Create a query that will join the tables `"kmooc_user"`, `"kmooc_developer"`, `"kmooc_tech"`, and `"kmooc_developer_techs"` (all INNER JOIN). Note that a developer with multiple techniques will appear multiple times.

The output table should contain the following columns:
* *first_name* column from the `"kmooc_user"` table
* *last_name* column from the `"kmooc_user"` table
* *level* column from the `"kmooc_developer"` table
* *name* column from the `"kmooc_tech"` table, aliased as `"tech_name"`

Sort the output table in ascending order of the *first_name* column. Limit the result to the first 10 records.

**Format**:
>```
('Daniel', 'Harris', 'junior', 'python')
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT u.first_name, u.last_name, d.level, t.name AS tech_name
FROM kmooc_user AS u
INNER JOIN kmooc_developer AS d
ON u.id = d.user_id
INNER JOIN kmooc_developer_techs AS dt
ON dt.developer_id = d.user_id
INNER JOIN kmooc_tech AS t
ON dt.tech_id = t.id
ORDER BY u.first_name ASC
LIMIT 10;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('Carol', 'Horn', 'mid', 'flutter')
('Carol', 'Horn', 'mid', 'dart')
('Carol', 'Horn', 'mid', 'git')
('Carol', 'Horn', 'mid', 'linux')
('Daniel', 'Harris', 'junior', 'python')
('Daniel', 'Harris', 'junior', 'html')
('Daniel', 'Harris', 'junior', 'css')
('Daniel', 'Thompson', 'mid', 'python')
('Daniel', 'Thompson', 'mid', 'html')
('Daniel', 'Thompson', 'mid', 'css')
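
The repetition of developers in this output comes from the many-to-many link table: each row of `kmooc_developer_techs` yields one output row. A minimal sketch on toy data illustrates this. The column names mirror the ones used in the queries in this notebook, but the table definitions and the sample rows here are assumptions made for illustration; the real ones are in 'create_database_proj3.sql'. The second sort key on the technique name is also an addition, used only to make the toy output reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema and data (assumed, not copied from the course file).
conn.executescript("""
CREATE TABLE kmooc_user (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE kmooc_developer (user_id INTEGER PRIMARY KEY, level TEXT);
CREATE TABLE kmooc_tech (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE kmooc_developer_techs (developer_id INTEGER, tech_id INTEGER);
INSERT INTO kmooc_user VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim');
INSERT INTO kmooc_developer VALUES (1, 'junior'), (2, 'senior');
INSERT INTO kmooc_tech VALUES (10, 'python'), (20, 'sql');
INSERT INTO kmooc_developer_techs VALUES (1, 10), (1, 20), (2, 20);
""")

# Each join has its own ON clause: user -> developer -> link table -> tech.
rows = conn.execute("""
SELECT u.first_name, u.last_name, d.level, t.name AS tech_name
FROM kmooc_user AS u
INNER JOIN kmooc_developer AS d ON u.id = d.user_id
INNER JOIN kmooc_developer_techs AS dt ON dt.developer_id = d.user_id
INNER JOIN kmooc_tech AS t ON t.id = dt.tech_id
ORDER BY u.first_name ASC, t.name ASC
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Ann has two link rows and therefore appears twice; Bob has one and appears once.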


## **Module 3**

We want to extract the number of known techniques for each developer. Create a query that will join the tables `"kmooc_user"`, `"kmooc_developer"`, `"kmooc_tech"`, and `"kmooc_developer_techs"` (all INNER JOIN).

The output table should contain the following columns:
* *first_name* column from the `"kmooc_user"` table
* *last_name* column from the `"kmooc_user"` table
* *level* column from the `"kmooc_developer"` table
* *num_techs* column - number of techniques for each developer

Sort the output table in descending order of the *num_techs* column and then in ascending order of the *first_name* column.

**Format**:
>```
('William', 'Lopez', 'mid', 6)
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT u.first_name, u.last_name, d.level, COUNT(t.id) AS num_techs
FROM kmooc_user AS u
INNER JOIN kmooc_developer AS d
ON u.id = d.user_id
INNER JOIN kmooc_developer_techs AS dt
ON dt.developer_id = d.user_id
INNER JOIN kmooc_tech AS t
ON dt.tech_id = t.id
GROUP BY u.id, u.first_name, u.last_name, d.level
ORDER BY num_techs DESC, u.first_name ASC;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('William', 'Lopez', 'mid', 6)
('Daniel', 'Thompson', 'mid', 5)
('Paula', 'Burke', 'junior', 5)
('Carol', 'Horn', 'mid', 4)
('James', 'Simons', 'senior', 4)
('Sharon', 'Johnson', 'junior', 4)
('Daniel', 'Harris', 'junior', 3)
('John', 'Taylor', 'senior', 3)
('Michaela', 'Garrison', 'senior', 3)
('Shelly', 'Hudson', 'senior', 3)
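
One subtlety when counting per developer: grouping by the name columns alone would merge two different users who happen to share a name, whereas grouping by the primary key keeps them apart. The sample data shown above has no such namesakes, so both groupings give the same result there. The following sketch uses a made-up pair of tables (`toy_user`, `toy_developer_techs`) with two distinct users named 'Sam Park' to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: users 1 and 2 are different people with the same name.
conn.executescript("""
CREATE TABLE toy_user (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE toy_developer_techs (developer_id INTEGER, tech_id INTEGER);
INSERT INTO toy_user VALUES (1, 'Sam', 'Park'), (2, 'Sam', 'Park'), (3, 'Eve', 'Cho');
INSERT INTO toy_developer_techs VALUES (1, 10), (1, 20), (2, 10), (3, 30);
""")

# Grouping by name: both 'Sam Park' users collapse into a single group.
by_name = conn.execute("""
SELECT u.first_name, u.last_name, COUNT(*) AS num_techs
FROM toy_user AS u
INNER JOIN toy_developer_techs AS dt ON dt.developer_id = u.id
GROUP BY u.first_name, u.last_name
ORDER BY num_techs DESC, u.first_name ASC
""").fetchall()

# Grouping by primary key: one group per user, namesakes stay separate.
by_id = conn.execute("""
SELECT u.first_name, u.last_name, COUNT(*) AS num_techs
FROM toy_user AS u
INNER JOIN toy_developer_techs AS dt ON dt.developer_id = u.id
GROUP BY u.id
ORDER BY num_techs DESC, u.first_name ASC, u.id ASC
""").fetchall()
conn.close()

print(by_name)
print(by_id)
```

With name grouping, 'Sam Park' is reported once with a count of 3; with key grouping, the two users are reported separately with counts of 2 and 1.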


## **Module 4**

We want to extract the number of developers for each level of experience: *junior*, *mid*, *senior*.

The output table should contain the following columns:
* *level* column from the `"kmooc_developer"` table
* *num_developers* column - number of developers for a given level of experience

Sort the output table in descending order of the *num_developers* column.

**Format**:
>```
('junior', 8)
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT level, count(level) num_developers
FROM kmooc_developer
GROUP BY level
ORDER BY num_developers DESC;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('senior', 4)
('junior', 3)
('mid', 3)


## **Module 5**

We want to extract the number of developers for each technique. Create a query that will join the tables `"kmooc_developer_techs"` and `"kmooc_tech"` (INNER JOIN). Note that a developer who knows multiple techniques is counted once for each of them.

The output table should contain the following columns:
* *name* column from the `"kmooc_tech"` table
* *num_developers* column - number of developers for a given technique

Sort the output table in descending order of the *num_developers* column.

**Format**:
>```
('python', 8)
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT name, count(tech_id) num_developers
FROM kmooc_developer_techs AS dt
INNER JOIN kmooc_tech AS t
ON dt.tech_id = t.id
GROUP BY tech_id
ORDER BY num_developers DESC;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('html', 5)
('git', 5)
('python', 4)
('css', 4)
('javascript', 3)
('django', 3)
('java', 2)
('sql', 2)
('flutter', 2)
('dart', 2)
('linux', 2)
('c++', 1)
('c#', 1)
('unity', 1)
('testing', 1)
('swift', 1)
('kotlin', 1)
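
Several techniques in this output share the same count (for example 'html' and 'git'). When `ORDER BY` names only the count, SQL leaves the relative order of tied rows unspecified, so it may vary between runs or SQLite versions. The assignment does not ask for a tie-breaker; the sketch below only shows, on made-up tables (`toy_tech`, `toy_developer_techs`), how a second sort key could make the order reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: 'html' and 'git' are each known by two developers.
conn.executescript("""
CREATE TABLE toy_tech (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE toy_developer_techs (developer_id INTEGER, tech_id INTEGER);
INSERT INTO toy_tech VALUES (1, 'html'), (2, 'git'), (3, 'swift');
INSERT INTO toy_developer_techs VALUES (1, 1), (2, 1), (1, 2), (3, 2), (3, 3);
""")

# The second sort key (t.name) decides the order among rows with equal counts.
rows = conn.execute("""
SELECT t.name, COUNT(*) AS num_developers
FROM toy_developer_techs AS dt
INNER JOIN toy_tech AS t ON dt.tech_id = t.id
GROUP BY t.id
ORDER BY num_developers DESC, t.name ASC
""").fetchall()
conn.close()

for row in rows:
    print(row)
```

Here 'git' is listed before 'html' because the two are tied on the count and the name is used as the tie-breaker.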


## **Module 6**

We want to find the users who have **not** registered as a developer. Create a query that lists such non-developer users in the database.

The output table should contain the following columns:
* *first_name* column from the `"kmooc_user"` table
* *last_name* column from the `"kmooc_user"` table

Sort the output table in ascending order of the *first_name* column.

**Format**:
>```
('Lisa', 'Johnson')
...
```

Print the result using the code cell below.

In [None]:
import sqlite3

conn = sqlite3.connect('kmooc.sqlite3')
cur = conn.cursor()

# Enter your solution here
cur.execute("""
SELECT u.first_name, u.last_name
FROM kmooc_user AS u
LEFT JOIN kmooc_developer AS d
ON d.user_id = u.id
WHERE d.user_id IS NULL
ORDER BY u.first_name ASC;
""")
for row in cur.fetchall():
  print(row)

conn.commit()
conn.close()

('Joshua', 'Brown')
('Lisa', 'Johnson')
('Mason', 'Robinson')
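
The `LEFT JOIN ... IS NULL` pattern used here is one way to express an anti-join; a `NOT EXISTS` subquery is a common alternative that returns the same rows. The comparison below is a sketch on made-up tables (`toy_user`, `toy_developer`), not on the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical toy data: only user 2 is registered as a developer.
conn.executescript("""
CREATE TABLE toy_user (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE toy_developer (user_id INTEGER PRIMARY KEY, level TEXT);
INSERT INTO toy_user VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim'), (3, 'Cy', 'Oh');
INSERT INTO toy_developer VALUES (2, 'mid');
""")

# Variant 1: keep every user, then filter to those with no matching developer row.
left_join = conn.execute("""
SELECT u.first_name, u.last_name
FROM toy_user AS u
LEFT JOIN toy_developer AS d ON d.user_id = u.id
WHERE d.user_id IS NULL
ORDER BY u.first_name ASC
""").fetchall()

# Variant 2: keep a user only if no developer row references it.
not_exists = conn.execute("""
SELECT u.first_name, u.last_name
FROM toy_user AS u
WHERE NOT EXISTS (SELECT 1 FROM toy_developer AS d WHERE d.user_id = u.id)
ORDER BY u.first_name ASC
""").fetchall()
conn.close()

print(left_join)
print(not_exists)
```

Both variants return the two users without a developer row; which one reads better is largely a matter of taste.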
