# SQL and modeling foundations
    Data dominates. If you’ve chosen the right data structures
    and organized things well, the algorithms will almost always
    be self-evident. Data structures, not algorithms, are central to
    programming.
    — Rob Pike

## But first... A note on the course

* Databases for developers

* Learning objectives
  * https://github.com/datsoftlyngby/soft2018spring/blob/master/docs/DB_plan.md#learning-objectives

# Learning objectives
## Knowledge
The student must have knowledge of:

 * Various database types and the underlying models
 * A specific database system’s storage organisation and query execution
 * A specific database system’s optimisation possibilities – including advantages and disadvantages
 * Database-specific security problems and their solutions
 * Concepts and issues when handling big data
 * The particular issues raised by having many simultaneous transactions, including in connection with distributed databases
 * Relational algebra (including its relationship to execution plans)

## Skills
The student can:

 * Transform logical data models into physical models in various database types
 * Implement database optimisation
 * Use parts of the administration tool to assist in the optimisation and tuning of existing databases, including the incorporation of a specific DBMS’ execution plans
 * Use a specific database system’s tools for handling simultaneous transactions
 * Use the programming and other facilities provided by a modern DBMS


## Competencies
The student can:
 
 * Analyse the application domain in order to select a database type
 * Divide responsibility for tasks between the application and DBMS during system development, to ensure the best possible implementation.


## But first... A note on how to learn

* Metacognition
* Dunning-Kruger effect

## Metacognition
  * Think about how you learn
    - Auditory
    - Visual
    - Kinesthetic
    - Tactile

![Dunning-Kruger effect](https://i.imgur.com/jbo2gy5.jpg)

# Modeling basics

    Data dominates. If you’ve chosen the right data structures
    and organized things well, the algorithms will almost always
    be self-evident. Data structures, not algorithms, are central to
    programming.
    — Rob Pike

    I will, in fact, claim that the difference between a bad programmer
    and a good one is whether he considers his code or his data
    structures more important. Bad programmers worry about the
    code. Good programmers worry about data structures and their
    relationships.
    — Linus Torvalds

## What is a model?

* A description of the real world

Four objectives:
  1. Enhance an individual's understanding of the representative system
  2. Facilitate efficient conveyance of system details between stakeholders
  3. Provide a point of reference for system designers to extract system specifications
  4. Document the system for future reference and provide a means for collaboration
  
Source: [Wikipedia](https://en.wikipedia.org/wiki/Conceptual_model)

## Models as thinking tools

* Models are necessary to understand the world
* Models are what we communicate
* Without models, we cannot talk to our clients

## Modeling steps

* Conceptual data model
  * Main objects and goals
  * Little to no detail

* Logical data model
  * Model the data with consideration of the actual system
  * Entity-relationship diagrams (ERD)

* Physical data model
  * Actual implementation in a database
  * Tables, indices, relations etc.

# Focus of this course

* Logical data modeling with ERD
* Physical data modeling with SQL

... Remember the learning objectives:

## Skills
The student can:

 * Transform logical data models into physical models in various database types

# Logical data model
_A logical data model or logical schema is a data model of a specific problem domain expressed independently of a particular database management product or storage technology (physical data model) but in terms of data structures such as relational tables and columns, object-oriented classes, or XML tags. This is as opposed to a conceptual data model, which describes the semantics of an organization without reference to technology._ [Wikipedia](https://en.wikipedia.org/wiki/Logical_data_model)

# Logical data model, take 2

*Abstract diagram of data structure*

  * Entities
    * A customer, a house, a car ...

  * Attributes
    * Customer name, email, phone number...

  * Relationships
    * One customer can order many things
    * One order only concerns one customer

  * Type-instance distinctions
    * A Skoda is an instance of a car

# Entity-relationship diagram (ERD)

* Entities
* Relationships
* Attributes

# Entity

* Drawn as rectangles

![Entity](images/entity.svg)

# Attributes
* Drawn as ovals (or circles) connected to their entity
![Attributes](images/attribute.svg)

# Relationships

* Drawn with lines between entities
![Simple relationship](images/relationship.svg)

What are we missing?

# Relationships

* Drawn with lines between entities
* Relationships written in diamonds
![Simple relationship](images/relationship_with_name.svg)

What are we missing?

# Relationships

* Drawn with lines between entities
* Relationships written in diamonds
* Cardinality of relationship written near entities
![Simple relationship](images/relationship_with_n.svg)

# ERD example
![ERD example](images/relationship_example.svg)
Source: [Graphviz](https://graphviz.gitlab.io/_pages/Gallery/undirected/ER.html)

# Exercise: ERD

Now your turn. Model a database of music. In it are the following entities:

* Artist
* Album
* Song

Your job is to model the relationships between them using *entities* and *relationships*. If you have more time, try to work out the *attributes* of your entities.

## Keys

* A key is an attribute (or set of attributes) that uniquely identifies each instance of an entity; key attributes are underlined

  * Customer
    * <u>name</u>
    * email
    * ...

# Recap: Modeling

* Models as an abstraction
* Logical data model
  * Opposed to the physical data model
* Entity-relationship diagram
  * Entities, relationships, attributes

# Physical data model

*Implementation of a logical data model in a database*

What do we need to implement such a model?

We need a structure. Some way to methodically represent our logical data model.
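
As a minimal sketch of what such a structure could look like, the customer and order entities from the logical model can be written as two tables. This assumes SQLite through Python's built-in `sqlite3` module; the table names, the surrogate `customer_id` key and the sample rows are hypothetical and only serve as illustration.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked to

# Entity "customer": attributes become columns, the key becomes the primary key.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        name        TEXT NOT NULL,
        email       TEXT,
        phone       TEXT
    )
""")

# The one-to-many relationship becomes a foreign key on the "many" side,
# so each order concerns exactly one customer.
conn.execute("""
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        item        TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO customer (customer_id, name, email) VALUES (1, 'Ada', 'ada@example.com')")
conn.executemany(
    "INSERT INTO customer_order (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

orders_for_ada = conn.execute(
    "SELECT item FROM customer_order WHERE customer_id = 1 ORDER BY order_id"
).fetchall()
print(orders_for_ada)  # [('keyboard',), ('monitor',)]

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO customer_order (customer_id, item) VALUES (99, 'mouse')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)  # True
```

Entities map to tables, attributes to columns, and the relationship to a foreign key. Other database types would express the same logical model differently.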

# Smart structured language<sup>TM</sup> 

We need something that can

* Understand entities
* Understand relations
* Understand attributes
* CRUD on all the above

## Introdinner

  * Either the 8th or 15th 
  * Pick the 8th if you can
  * https://doodle.com/poll/d35za4kpnxf7shqk -- http://bit.ly/2opv3HQ

# Structured query language (SQL)

A domain-specific language for defining, querying and changing data stored as relations (tables of tuples).

# Structured query language (SQL)
<div style="float:right; width: 45%"><br/><br/><img alt="SQL" style="width:100%;" src="https://wikimedia.org/api/rest_v1/media/math/render/svg/b0bfef3c941c1a88d3990bd1472653e60cf02d0a" /></div>
* Statements
  * May also be a query
* Clauses
  * ``select``-clause, ``where``-clause etc.
* Expressions
  * ``population + 1``, ``'Boris Jeltsin'``
* Predicates
  * ``name = 'USA'``

## SQL tables

SQL presupposes you have structured information (tables).

Imagine a table ``actors``:
<table style="font-size:90%">
    <tr>
        <th>name</th><th>age</th>
    </tr>
    <tr><td>Chevy Chase</td><td>75</td></tr>
    <tr><td>Donald Glover</td><td>35</td></tr>
    <tr><td>Danny Pudi</td><td>39</td></tr>
    <tr><td>Ken Jeong</td><td>48</td></tr>
</table>
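
To experiment with this table, it could be created roughly as follows. This is a sketch that assumes SQLite through Python's built-in `sqlite3` module; the column types `TEXT` and `INTEGER` are assumptions, since the slide only shows the column names.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Column types are an assumption; the slide only names the columns.
conn.execute("CREATE TABLE actors (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO actors (name, age) VALUES (?, ?)",
    [("Chevy Chase", 75), ("Donald Glover", 35), ("Danny Pudi", 39), ("Ken Jeong", 48)],
)

rows = conn.execute("SELECT name, age FROM actors").fetchall()
for name, age in rows:
    print(f"{name:<15}{age}")
```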


## SQL statements
<table style="font-size:90%">
    <tr>
        <th>name</th><th>age</th>
    </tr>
    <tr><td>Chevy Chase</td><td>75</td></tr>
    <tr><td>Donald Glover</td><td>35</td></tr>
    <tr><td>Danny Pudi</td><td>39</td></tr>
    <tr><td>Ken Jeong</td><td>48</td></tr>
</table>

<div style="float:right">
    ``UPDATE actors SET age = 76 WHERE name = 'Chevy Chase';``
</div>

<div style="float:right">
    ``DELETE FROM actors WHERE name = 'Chevy Chase';``
</div>

<div style="float:right">
    ``INSERT INTO actors (name, age) VALUES ('Gillian Jacobs', 37);``
</div>
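
The three statements above can be tried out with a small script. This is a sketch under the assumption of SQLite via Python's `sqlite3` module, with assumed column types; `rowcount` reports how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (name TEXT, age INTEGER)")  # column types are assumed
conn.executemany(
    "INSERT INTO actors (name, age) VALUES (?, ?)",
    [("Chevy Chase", 75), ("Donald Glover", 35), ("Danny Pudi", 39), ("Ken Jeong", 48)],
)

# Each statement changes the table; rowcount tells us how many rows were affected.
updated = conn.execute("UPDATE actors SET age = 76 WHERE name = 'Chevy Chase'").rowcount
deleted = conn.execute("DELETE FROM actors WHERE name = 'Chevy Chase'").rowcount
inserted = conn.execute("INSERT INTO actors (name, age) VALUES ('Gillian Jacobs', 37)").rowcount
print(updated, deleted, inserted)  # 1 1 1

# Sort in Python so the printed result does not depend on storage order.
remaining = sorted(conn.execute("SELECT name, age FROM actors").fetchall())
print(remaining)
# [('Danny Pudi', 39), ('Donald Glover', 35), ('Gillian Jacobs', 37), ('Ken Jeong', 48)]
```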

## SQL statement exercise
For each task below, write **one** statement that:

* Deletes Ken Jeong, without using his name
* Updates Danny Pudi's name to "Evil Abed"
* Inserts your favourite Community actor (Joel McHale, aged 47)
* Deletes all actors older than 40
* Sets the name of everyone to "I am in love with SQL"

## SQL queries

* Queries ask questions about the data, but never change it
* Without a doubt the most used part of SQL

* Contains at a minimum ``SELECT`` and ``FROM``
  * ``SELECT name FROM actors``
  * ``SELECT name, age FROM actors``
  * ``SELECT * FROM actors``

## SQL WHERE
* Can also contain a conditional ``WHERE`` clause:
  * ``SELECT * FROM actors WHERE age > 40``
  * ``SELECT * FROM actors WHERE name != 'Donald Glover'``  (``<>`` is the standard spelling of ``!=``)


* ``WHERE`` clauses can combine conditions with ``AND``, ``OR`` and ``NOT``
  * ``SELECT * FROM actors WHERE age < 10 OR name != 'Donald Glover'``
  * ``SELECT * FROM actors WHERE age > 10 AND NOT name = 'Donald Glover'``
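
A short sketch of how such queries behave on the ``actors`` table, assuming SQLite via Python's `sqlite3` module. The helper `names` is hypothetical and only keeps the example compact; string literals use single quotes, which is what standard SQL expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (name TEXT, age INTEGER)")  # column types are assumed
conn.executemany(
    "INSERT INTO actors (name, age) VALUES (?, ?)",
    [("Chevy Chase", 75), ("Donald Glover", 35), ("Danny Pudi", 39), ("Ken Jeong", 48)],
)

def names(sql):
    """Hypothetical helper: run a query and return its first column, sorted in Python."""
    return sorted(row[0] for row in conn.execute(sql))

over_40 = names("SELECT name FROM actors WHERE age > 40")
print(over_40)  # ['Chevy Chase', 'Ken Jeong']

not_donald = names("SELECT name FROM actors WHERE name <> 'Donald Glover'")
print(not_donald)  # ['Chevy Chase', 'Danny Pudi', 'Ken Jeong']

# AND narrows the result, OR widens it, NOT negates a predicate.
both = names("SELECT name FROM actors WHERE age > 40 AND NOT name = 'Chevy Chase'")
print(both)  # ['Ken Jeong']

either = names("SELECT name FROM actors WHERE age < 39 OR name = 'Ken Jeong'")
print(either)  # ['Donald Glover', 'Ken Jeong']
```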

## SQL wildcard string search

* ``LIKE`` compares a string against a pattern, which allows partial matches; without wildcards it behaves much like ``=``
  * ``SELECT * FROM actors WHERE name LIKE 'Donald Glover'``

* In ``LIKE`` patterns, ``%`` is a wildcard for any sequence of zero or more characters
  * ``SELECT * FROM actors WHERE name LIKE 'Donald%'``
  * ``SELECT * FROM actors WHERE name LIKE '%y'``


* ... and ``_`` is a wildcard for exactly one character
  * ``SELECT * FROM actors WHERE name LIKE '_onald Glover'``
  * ``SELECT * FROM actors WHERE name LIKE '_onald%'``
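
The wildcard patterns can be checked against the ``actors`` table with a sketch like the one below, assuming SQLite via Python's `sqlite3` module. Note that SQLite's ``LIKE`` ignores case for ASCII letters by default; other database systems may behave differently. The helper `matching` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (name TEXT, age INTEGER)")  # column types are assumed
conn.executemany(
    "INSERT INTO actors (name, age) VALUES (?, ?)",
    [("Chevy Chase", 75), ("Donald Glover", 35), ("Danny Pudi", 39), ("Ken Jeong", 48)],
)

def matching(pattern):
    """Hypothetical helper: names that match a LIKE pattern, sorted in Python."""
    rows = conn.execute("SELECT name FROM actors WHERE name LIKE ?", (pattern,))
    return sorted(row[0] for row in rows)

print(matching("Donald%"))  # ['Donald Glover']  (% matches ' Glover')
print(matching("_onald%"))  # ['Donald Glover']  (_ matches the single 'D')
print(matching("%y"))       # []                 (no name ends in 'y')
print(matching("%y%"))      # ['Chevy Chase', 'Danny Pudi']
```

Passing the pattern as a ``?`` parameter instead of pasting it into the SQL string is a design choice that also avoids quoting problems.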

## SQL ORDER BY

You can also order query results (ascending by default):

    SELECT * FROM actors ORDER BY age
  
    SELECT age FROM actors ORDER BY name
    
Useful in connection with ``LIMIT``:

    SELECT * FROM actors ORDER BY age LIMIT 1
    
    SELECT age FROM actors ORDER BY name LIMIT 1000000
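
A sketch of these queries run against the ``actors`` table, assuming SQLite via Python's `sqlite3` module. The ``DESC`` variant is an addition that is not on the slide; it reverses the sort order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actors (name TEXT, age INTEGER)")  # column types are assumed
conn.executemany(
    "INSERT INTO actors (name, age) VALUES (?, ?)",
    [("Chevy Chase", 75), ("Donald Glover", 35), ("Danny Pudi", 39), ("Ken Jeong", 48)],
)

youngest_first = conn.execute("SELECT * FROM actors ORDER BY age").fetchall()
print(youngest_first)
# [('Donald Glover', 35), ('Danny Pudi', 39), ('Ken Jeong', 48), ('Chevy Chase', 75)]

# The ORDER BY column does not have to be selected.
ages_by_name = [row[0] for row in conn.execute("SELECT age FROM actors ORDER BY name")]
print(ages_by_name)  # [75, 39, 35, 48]

# LIMIT 1 after sorting picks the first row of the ordering.
youngest = conn.execute("SELECT * FROM actors ORDER BY age LIMIT 1").fetchone()
oldest = conn.execute("SELECT * FROM actors ORDER BY age DESC LIMIT 1").fetchone()
print(youngest, oldest)  # ('Donald Glover', 35) ('Chevy Chase', 75)
```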

## SQL queries exercises

For each task below, write **one** query that:

* Finds the age of all actors
* Finds all actors who are younger than 70 and **not** called "Donald Glover"
* Finds the name of the oldest actor who is **not** Chevy Chase
* Sorts all actors by name, except the ones whose name starts with 'K'
* Finds all actors who do *not* have an 'e' as the last or second-to-last character of their name

## SQL queries and beyond!
* SQL queries can do so much more

* Literature for next week will be about database systems and SQL syntax
  * I expect you to read up on the syntax yourselves!

* ... Please do! Otherwise you will lose time on things you do not have time for!

#### Study activity

  * Read 4 hrs
  * Exercises 4 hrs

# Assignment: ERD of IMDB
For this assignment you will create an ERD from actual data. We will be using the IMDB dataset available from http://www.imdb.com/interfaces/

Your job is to draw (yes, draw) an ERD that includes all the information from the files
``title.basics``, ``title.crew``, ``title.episode``, ``title.principals``, ``title.ratings``, ``name.basics``

Try to avoid redundancy (we'll talk more about this next week)

Hand in on peergrade.io. And don't forget to review other assignments!