<img src="images/lasalle_logo.png" style="width:375px;height:110px;">

# Week 9 - Basic Structured Query Language

### WIM250 - Introduction to Scripting Languages 
### Instructor: Ivaldo Tributino

Sources:
- Python for Everybody Exploring Data Using Python 3 by Dr. Charles R. Severance.
- [Oracle: what is database](https://www.oracle.com/ca-en/database/what-is-database/)

## 1. What is a database

A `database` is an organized collection of structured information, or data, typically stored electronically in a computer system. 

Data within the most common types of databases in operation today is typically modeled in rows and columns in a series of tables to make processing and data querying efficient. The data can then be easily accessed, managed, modified, updated, controlled, and organized. Most databases use `structured query language (SQL)` for writing and querying data.

`SQL` is a programming language used by nearly all relational databases to query, manipulate, and define data, and to provide access control. SQL was first developed at IBM in the 1970s, with Oracle as a major contributor, which led to the implementation of the SQL ANSI standard. Since then, SQL has spurred many extensions from companies such as IBM, Oracle, and Microsoft.

Many different database systems are used for a wide variety of purposes, including Oracle, MySQL, Microsoft SQL Server, PostgreSQL, and SQLite. We focus on `SQLite` in this course because it is a very common database and is already built into Python through the `sqlite3` module. `SQLite` is designed to be embedded into other applications to provide database support within the application.
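
As a quick check that SQLite really ships with Python, the following minimal sketch opens a temporary in-memory database and asks the SQLite library for its version. The special name `':memory:'` is used here only so that no file is created; it is not part of the course files.

```python
import sqlite3

# An in-memory database needs no file and no separate server process.
conn = sqlite3.connect(':memory:')
version = conn.execute('SELECT sqlite_version()').fetchone()[0]
print('SQLite library version:', version)
conn.close()
```

If this runs without an `ImportError`, nothing else has to be installed to follow the Python examples in this lesson.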

<img src="images/database.png" style="width:100px;height:100px;">

Download [DB Browser for SQLite](https://sqlitebrowser.org/dl/), a graphical tool for viewing and editing SQLite database files.

## 2. Creating a database table

Defining structure for your data up front may seem inconvenient at the beginning, but the payoff is fast access to your data even when the database contains a large amount of data.

The code to create a database file and a table named `Courses` with three columns, `course_id`, `course_name` and `teacher_id`, is as follows:


In [None]:
import sqlite3

conn = sqlite3.connect('courses.sqlite') # connect to the database stored in the file courses.sqlite (the file is created if it does not exist)
cur = conn.cursor() # cursor() is conceptually very similar to calling open() when dealing with text files.

# Once we have the cursor, we can begin to execute commands on the contents of
# the database using the execute() method.

# Remove the Courses table from the database if it exists.
cur.execute('DROP TABLE IF EXISTS Courses')
# Create a table named Courses with two TEXT columns (course_id and course_name) and an INTEGER column (teacher_id).
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')

conn.close()
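
To confirm that the table has the structure we asked for, one option is SQLite's `PRAGMA table_info` command. The sketch below is only an illustration: it uses an in-memory database as a stand-in for `courses.sqlite` so that it can be run on its own.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory stand-in for courses.sqlite
cur = conn.cursor()
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in cur.execute('PRAGMA table_info(Courses)')]
print(columns)
conn.close()
```

Each pair in the printed list is a column name followed by its declared type.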

You can look at the various data types supported by SQLite at the following url:
http://www.sqlite.org/datatypes.html
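
To see how SQLite actually stores a value, we can ask it with the built-in SQL function `typeof()`. This is a small sketch with a hypothetical table named `Demo`, which is not part of the course database.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
# Demo is a hypothetical table used only for this illustration.
cur.execute('CREATE TABLE Demo (label TEXT, amount INTEGER, ratio REAL)')
cur.execute('INSERT INTO Demo (label, amount, ratio) VALUES (?, ?, ?)', ('a', 42, 0.5))

# typeof() reports the storage class of each stored value.
cur.execute('SELECT typeof(label), typeof(amount), typeof(ratio) FROM Demo')
storage = cur.fetchone()
print(storage)
conn.close()
```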

As a convention, we will show the `SQL keywords` in uppercase and the parts of the execute command that we are adding (such as the table and column names) will be shown in lowercase.

Now that we have created a table named `Courses`, we can put some data into that table using the `SQL INSERT` operation. Again, we begin by making a connection to the database and obtaining the `cursor`. We can then execute SQL commands using the cursor.

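The SQL INSERT command indicates which table we are using and then defines a new row by listing the fields we want to include, followed by the VALUES to place in them. The question marks are placeholders that are filled from the tuple passed as the second argument to `execute()`. Example: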

```python
'INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)', ('GE091', 'Math Transitional', 3363)
```

In [None]:
import sqlite3

conn = sqlite3.connect('courses.sqlite')
cur = conn.cursor()

cur.execute('INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)',
    ('GE091', 'Math Transitional', 3363))
cur.execute('INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)',
    ('MTH180', 'Geometry', 3363))
conn.commit() # force the data to be written to the database file.

print('Courses:')
cur.execute('SELECT course_id, course_name, teacher_id FROM Courses') # we indicate which columns we would like
for row in cur:
    print(row)

cur.close()
conn.close() # release the database file once we are done
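
When there are many rows to insert, repeating `execute()` gets tedious. A possible alternative, sketched here against an in-memory copy of the `Courses` table, is `executemany()`, which runs the same statement once per tuple in a list. The variable name `new_courses` is our own choice.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory stand-in for courses.sqlite
conn.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')

new_courses = [
    ('GE091', 'Math Transitional', 3363),
    ('MTH180', 'Geometry', 3363),
]

# Used as a context manager, the connection commits if the block succeeds
# and rolls back if it raises an exception; it does not close the connection.
with conn:
    conn.executemany(
        'INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)',
        new_courses)

rows = conn.execute('SELECT course_id, course_name, teacher_id FROM Courses').fetchall()
print(rows)
conn.close()
```

The `with conn:` block replaces the explicit `conn.commit()` call, so it is harder to forget to save the changes.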

Now let's open `DB Browser for SQLite`, click `Open Database` and select `courses.sqlite`. In the `Browse Data` tab, add the row `MTH100, Mathematics, 3363`.

<img src="images/browseData.png" style="width:700px;height:400px;">

Well, we need to know some SQL commands, so let's add two rows using the SQL `INSERT` command in `Execute SQL`:

```SQL
INSERT INTO Courses (course_id, course_name, teacher_id) VALUES ('WIM250','Introduction to Scripting Languages',3363);
INSERT INTO Courses (course_id, course_name, teacher_id) VALUES ('WDIM150','Introduction to Scripting Languages',3363);
```

<img src="images/executeSQL.png" style="width:700px;height:400px;">

Cool, huh? We have now added rows to our database in three different ways: from Python, through `Browse Data`, and with `Execute SQL`.

### Let's see three more commands: `SELECT`, `DELETE` and `UPDATE`.

```SQL
SELECT * FROM Courses WHERE course_name = 'Introduction to Scripting Languages'
```
Using * indicates that you want the database to return all of the columns for each row that matches the `WHERE clause`.

**Note, unlike in Python, in a SQL WHERE clause we use a single equal sign to indicate a test for equality rather than a double equal sign. Other logical operations allowed in a WHERE clause include <, >, <=, >=, !=, as well as AND and OR and parentheses to build your logical expressions.**
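
The same kind of query can be run from Python. The sketch below combines `AND`, `OR` and parentheses in one WHERE clause; it uses an in-memory table filled with a few illustrative rows, so it does not touch `courses.sqlite`.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory stand-in for courses.sqlite
cur = conn.cursor()
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')
cur.executemany('INSERT INTO Courses VALUES (?, ?, ?)', [
    ('GE091', 'Math Transitional', 3363),
    ('MTH180', 'Geometry', 3363),
    ('WIM250', 'Introduction to Scripting Languages', 5491),
])

# A single = tests equality; parentheses group the OR before the AND is applied.
cur.execute(
    'SELECT course_id FROM Courses '
    'WHERE teacher_id = ? AND (course_id = ? OR course_name = ?)',
    (3363, 'GE091', 'Geometry'))
matches = sorted(row[0] for row in cur)
print(matches)
conn.close()
```

Only the two courses taught by teacher 3363 satisfy the condition; the third row is filtered out by the `teacher_id` test.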


- You can request that the returned rows be sorted by one of the fields as follows:
```SQL
SELECT course_name, course_id FROM Courses ORDER BY course_name
```
- To remove a row, you need a WHERE clause on an SQL DELETE statement. The WHERE clause determines which rows are to be deleted: 

```SQL
DELETE FROM Courses WHERE course_name = 'blabla'
```

- To DELETE rows that contain NULL, use:
```SQL
DELETE FROM Courses WHERE course_id IS NULL
```
- If you have duplicate rows and would like to DELETE them, you can use this statement, which keeps only the row with the smallest `rowid` in each group of identical rows:

```SQL
DELETE FROM Courses
WHERE rowid NOT IN (SELECT MIN(rowid)
                    FROM Courses
                    GROUP BY course_id, course_name, teacher_id)
```

- It is possible to UPDATE a column or columns within one or more rows in a table using the SQL UPDATE statement as follows:

```SQL
UPDATE Courses SET teacher_id = 5491 WHERE course_name = 'Introduction to Scripting Languages'
```
The UPDATE statement specifies a table and then a list of fields and values to change after the SET keyword and then an optional WHERE clause to select the rows that are to be updated. A single UPDATE statement will change all of the rows that match the WHERE clause. If a WHERE clause is not specified, it performs the UPDATE on all of the rows in the table.
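
The DELETE and UPDATE statements above can also be issued from Python. In this sketch, which again uses an in-memory table with illustrative rows, the cursor's `rowcount` attribute tells us how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory stand-in for courses.sqlite
cur = conn.cursor()
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')
cur.executemany('INSERT INTO Courses VALUES (?, ?, ?)', [
    ('WIM250', 'Introduction to Scripting Languages', 3363),
    ('WIM250', 'Introduction to Scripting Languages', 3363),  # exact duplicate
    ('WDIM150', 'Introduction to Scripting Languages', 3363),
    ('MTH180', 'Geometry', 3363),
])

# Keep only the lowest rowid in each group of identical rows.
cur.execute('''DELETE FROM Courses
               WHERE rowid NOT IN (SELECT MIN(rowid) FROM Courses
                                   GROUP BY course_id, course_name, teacher_id)''')
deleted = cur.rowcount

# Change the teacher for every row that matches the WHERE clause.
cur.execute('UPDATE Courses SET teacher_id = ? WHERE course_name = ?',
            (5491, 'Introduction to Scripting Languages'))
updated = cur.rowcount
conn.commit()

remaining = cur.execute(
    'SELECT course_id, teacher_id FROM Courses ORDER BY course_id').fetchall()
print(deleted, updated, remaining)
conn.close()
```

Checking `rowcount` before committing is a cheap way to notice a WHERE clause that matches more rows than intended.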

## 3. Data storage

Let's take some information from the JSON file `daily_14.json` and create a database with it. For each city, we are interested in its name, its coordinates, and the highest of the daily maximum temperatures, in Celsius, over the 16-day forecast.

In [None]:
import json

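# Each line of the file holds one complete JSON document, so we parse it line by line.
# The with statement closes the file as soon as the list has been built.
with open('daily_14.json', 'r') as f:
    daily_data = [json.loads(line) for line in f]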
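
If you do not have the file at hand, the line-by-line parsing can be tried on a small stand-in. The two records below are made up and much shorter than the real ones; `io.StringIO` simply lets a string behave like an open text file.

```python
import io
import json

# Two hypothetical lines standing in for the contents of the real file.
fake_file = io.StringIO(
    '{"city": {"name": "A"}, "data": []}\n'
    '{"city": {"name": "B"}, "data": []}\n'
)

# One json.loads call per line gives one dictionary per city.
records = [json.loads(line) for line in fake_file]
print(len(records), records[1]['city']['name'])
```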

In [None]:
print('There are %d distinct cities' %len(daily_data))
print('You can search %d day weather forecast with daily average parameters by city name' %len(daily_data[0]['data']))

Let's see what kind of information is offered for each city. For more information see [forecast16](https://openweathermap.org/forecast16). 

In [None]:
print(json.dumps(daily_data[0], indent = 4, sort_keys=True))

Let's create a class called `daily` with the following attributes:

- city_name
- longitude
- latitude
- max_temp

In [None]:
class daily:
    
    def __init__(self, city_name, longitude, latitude, max_temp):
        
        self.city_name = city_name
        self.longitude = longitude
        self.latitude = latitude
        self.max_temp = max_temp
        
    def __str__(self):
        return f'City: {self.city_name}'    

Now we are going to fill a dictionary with objects of the newly created class. The keys will be strings of the form:

```python
'city0' ... 'city22635'
```
and the values will be the corresponding objects of the `daily` class.

In [None]:
dic = {}
for idx, d in enumerate(daily_data):
    
    name = d["city"]['name']
    lon = d["city"]['coord']['lon']
    lat = d["city"]['coord']['lat']
    
    # highest daily maximum over the forecast, converted from Kelvin to Celsius
    maxm = max(x['temp']['max'] for x in d["data"])
    maxm = round(maxm - 273.15, 2)
    
    dic['city' + str(idx)] = daily(name, lon, lat, maxm)
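
To make the temperature step concrete, here is the same calculation on a single made-up record. The city name and values are hypothetical; only the nesting of the keys follows the loop above, and the subtraction of 273.15 assumes, as that loop does, that the file stores temperatures in Kelvin.

```python
# Hypothetical record with the same nesting as one line of the file.
record = {
    'city': {'name': 'Sampletown', 'coord': {'lon': -73.59, 'lat': 45.51}},
    'data': [
        {'temp': {'max': 280.15}},
        {'temp': {'max': 293.15}},
        {'temp': {'max': 288.65}},
    ],
}

# Take the largest daily maximum, then convert Kelvin to Celsius.
hottest_kelvin = max(day['temp']['max'] for day in record['data'])
hottest_celsius = round(hottest_kelvin - 273.15, 2)
print(hottest_celsius)
```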

In [None]:
print(dic['city134'])
dic['city134'].max_temp

Finally, we will create a table that is also called `Daily`; we are not very creative.

In [None]:
import sqlite3

conn = sqlite3.connect('daily.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Daily')
cur.execute('CREATE TABLE Daily (city_name TEXT, longitude FLOAT, latitude FLOAT, max_tem FLOAT)') 

for city in dic.values():
    
    cur.execute('INSERT INTO Daily (city_name, longitude, latitude, max_tem) VALUES (?, ?, ?, ?)',
                (city.city_name, city.longitude, city.latitude, city.max_temp))
conn.commit() # force the data to be written to the database file.

cur.close()
conn.close() # release the database file once we are done

<img src="images/dailyTable.png" style="width:700px;height:400px;">
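
Once the table is filled, SQL can answer questions about it directly. As an illustration, the sketch below lists the hottest cities using `ORDER BY ... DESC` together with `LIMIT`, a clause that was not introduced earlier in this lesson. The three rows are invented and stored in an in-memory table, so the real `daily.sqlite` is not needed to run it.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory stand-in for daily.sqlite
cur = conn.cursor()
cur.execute('CREATE TABLE Daily (city_name TEXT, longitude FLOAT, latitude FLOAT, max_tem FLOAT)')
cur.executemany('INSERT INTO Daily VALUES (?, ?, ?, ?)', [
    ('Alpha', 10.0, 50.0, 12.5),    # hypothetical rows
    ('Bravo', 20.0, 40.0, 31.2),
    ('Charlie', 30.0, 30.0, 27.8),
])

# Sort from hottest to coldest and keep only the first two rows.
cur.execute('SELECT city_name, max_tem FROM Daily ORDER BY max_tem DESC LIMIT 2')
hottest = cur.fetchall()
print(hottest)
conn.close()
```

The same SELECT statement can be pasted into `Execute SQL` in DB Browser for SQLite once `daily.sqlite` is open.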

In [None]:
import os
os.getcwd() # current working directory, the folder where daily.sqlite was written