# Introduction to SQL

## SQL (Structured Query Language)

**Goals**:
- Databases and their structure.
- Extract information from a database using SQL.

### **Relational databases**
1. Define relationships between tables of data inside the database, which makes
    it easier to draw conclusions and get insights from the data.
   (a) Like spreadsheet apps, but far more powerful:
       (i) Store more data.
       (ii) More secure: data can be encrypted.
       (iii) Many people can use the database at once.
   (b) Query: information is accessed and presented based on instructions.
       (i) SQL is used for creating, querying, and updating relational databases.
       (ii) MySQL is a widely used relational database management system.
            - A Graphical User Interface (GUI) tool such as MySQL Workbench
                can simplify database management and querying.

2. Tables: the building blocks of databases.
    (a) Rows are records (unlimited in number); columns are fields (limited in number).
    (b) Conventions:
        (i) Table names: lowercase, no spaces, can be plural.
        (ii) A record is a row in a table, holding an individual observation.
            - Identifiers are used to identify records in a table; they are unique
                    and often numbers.
        (iii) A field is a column in a table, holding one piece of information about
            all records. Field names should be:
            - Lowercase, without spaces, singular, different from other
                field names, and different from the table name.
        (iv) Splitting data across several tables is preferred, since one combined
            table can be less clear and contain duplicate information.
    (c) Note: SQL is used to gather and analyze information from tables; the tables
        themselves remain separate.

    (d) SQL data types for storing different kinds of data:
        (i) Each field holds a single data type, since:
            - Different data types are stored differently and take up different space.
            - Some operations only apply to certain data types.
        (ii) Strings: VARCHAR is flexible and can store small or large strings.
        (iii) Integers: INT is a SQL integer data type.
        (iv) Floats: NUMERIC stores numbers with a fractional part.
    (e) Schemas: the design of a database, showing its tables and their relationships
        and specifying the data type of each field within each table.
    (f) Database storage:
        (i) Information in a database is stored, like a file, on the hard disk of a server.
        (ii) Servers are centralized computers that perform services (such as data access)
            via requests made over a network, so many people can work at the same time.
        (iii) Any computer can be set up as a server.
        (iv) Servers are generally large and powerful, since they need to handle large
            volumes of data and requests.
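
The conventions, data types, and schema ideas above can be sketched as a pair of table definitions. This is a minimal sketch in PostgreSQL-style SQL, not a schema taken from these notes: the `departments` table, the `salary` field, and the `VARCHAR`/`NUMERIC` sizes are assumptions for illustration, and the `PRIMARY KEY` and `REFERENCES` constraints go beyond what is covered above.

```sql
-- Hypothetical schema; run in an empty database.
-- Table names: lowercase, no spaces, plural.
CREATE TABLE departments (
    id   INT PRIMARY KEY,   -- identifier: unique, numeric
    name VARCHAR(100)       -- string field
);

-- Field names: lowercase, no spaces, singular,
-- and different from the table name.
CREATE TABLE employees (
    id         INT PRIMARY KEY,                  -- identifier for each record
    name       VARCHAR(100),                     -- string field
    dept_id    INT REFERENCES departments (id),  -- relationship to departments
    year_hired INT,                              -- integer field
    salary     NUMERIC(10, 2)                    -- hypothetical field with a fractional part
);
```

Department details live in their own table and are linked through `dept_id`, so they are not repeated on every employee record.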

### **Queries**
1. SQL is used to answer questions, as a complement to other tools such as spreadsheet
    apps. It is especially useful for large data with complex relationships.
    (a) Uncover relationships, like trends in website traffic, customer reviews, and
        product sales.
    (b) Keywords: reserved words for operations.
        (i) SELECT: names the fields to retrieve.
            An asterisk (*) selects all fields of a table.
        (ii) FROM: names the table the fields are located in.
    (c) A semicolon indicates that a query is complete.
    (d) Result set: the output of a query. It can be saved for collaborators' use.

In [None]:
SELECT *
FROM books;
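
To retrieve only some fields rather than all of them, list the field names after SELECT, separated by commas. The following is a small self-contained sketch; the `title` and `author` fields and the sample rows are made up for illustration and are not the actual contents of `books`.

```sql
-- Hypothetical books table with made-up rows; run in an empty database.
CREATE TABLE books (
    id     INT,
    title  VARCHAR(100),
    author VARCHAR(100)
);

INSERT INTO books (id, title, author)
VALUES (1, 'Book A', 'Author X'),
       (2, 'Book B', 'Author Y');

-- Select two named fields; the id field is left out of the result set.
SELECT title, author
FROM books;
```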

2. More keywords:
    (a) Aliasing: the 'AS' keyword renames columns in the result set only.

In [None]:
SELECT name AS first_name, year_hired
FROM employees;

    (b) DISTINCT:
        (i) Returns only the distinct values of a field in the result set.
        (ii) Works with multiple fields, returning only the unique combinations of their values.

In [None]:
SELECT DISTINCT dept_id, year_hired
FROM employees;
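
The difference between DISTINCT on one field and on several fields is easier to see with data. This is a minimal sketch with made-up rows (the names and values are assumptions for illustration). Row order in the results is not guaranteed.

```sql
-- Hypothetical employees table with made-up rows; run in an empty database.
CREATE TABLE employees (
    id         INT,
    name       VARCHAR(100),
    dept_id    INT,
    year_hired INT
);

INSERT INTO employees (id, name, dept_id, year_hired)
VALUES (1, 'Ann', 10, 2019),
       (2, 'Bob', 10, 2019),
       (3, 'Cam', 10, 2021),
       (4, 'Dee', 20, 2021);

-- One field: returns two rows, dept_id 10 and dept_id 20.
SELECT DISTINCT dept_id
FROM employees;

-- Two fields: returns three rows, one per unique combination:
-- (10, 2019), (10, 2021), (20, 2021).
SELECT DISTINCT dept_id, year_hired
FROM employees;
```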

    (c) Views: a view is a virtual table that is the result of a saved SQL SELECT
        statement (the view name is assigned to that query's result set).
        (i) Creating a view returns no result set. Syntax: 'CREATE VIEW' + name + 'AS' + query.
        (ii) Views update automatically in response to updates in the underlying data.
        (iii) A view can be queried like a normal table.

In [None]:
CREATE VIEW employee_hire_years AS
SELECT id, name, year_hired
FROM employees;
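
Once a view exists, it can be queried like a table and reflects later changes to the underlying data. The sketch below assumes PostgreSQL-style SQL and made-up rows. Some systems, such as SQL Server, require CREATE VIEW to run in its own batch.

```sql
-- Hypothetical employees table with made-up rows; run in an empty database.
CREATE TABLE employees (
    id         INT,
    name       VARCHAR(100),
    dept_id    INT,
    year_hired INT
);

INSERT INTO employees (id, name, dept_id, year_hired)
VALUES (1, 'Ann', 10, 2019),
       (2, 'Bob', 20, 2021);

-- Creating the view returns no result set.
CREATE VIEW employee_hire_years AS
SELECT id, name, year_hired
FROM employees;

-- Query the view like a normal table: returns two rows.
SELECT name, year_hired
FROM employee_hire_years;

-- Change the underlying table.
INSERT INTO employees (id, name, dept_id, year_hired)
VALUES (3, 'Cam', 10, 2023);

-- The same query now returns three rows, including the new record.
SELECT name, year_hired
FROM employee_hire_years;
```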

### **SQL Flavors/Versions**
1. Some flavors are free; others are paid and complement major databases such as Microsoft's SQL Server or Oracle Database.
    (a) All are used with relational databases.
    (b) The majority of keywords are shared.
    (c) All must follow universal standards.
    (d) Only the additional features built on top of these standards make the flavors different.
2. Two of the most popular:
    (a) PostgreSQL
        (i) Free and open-source relational database system.
        (ii) Created at the University of California, Berkeley.
        (iii) The name refers to both the database system and its associated SQL flavor.
    (b) SQL Server
        (i) Available in free and paid (enterprise) versions.
        (ii) Created by Microsoft.
        (iii) T-SQL is Microsoft's SQL flavor, used with SQL Server databases.


Example: limiting the number of results in PostgreSQL (LIMIT, first query) and in SQL Server (TOP, second query).
    The difference between these flavors is small.

In [None]:
SELECT id, name
FROM employees
LIMIT 2;

In [None]:
SELECT TOP(2) id, name
FROM employees;