# Basic information

There are two types of databases: relational and non-relational
 - relational databases: hold very structured data in tables, stored in rows and columns
 - non-relational databases: for unstructured data

<div style="display: flex; gap: 10px;">
  <img src="relational.png" width="1200" height="300"/>
  <img src="non_relational.png" width="600" height="300"/>
</div>


 - Fact tables: contain the core data for business analysis; they measure and record business events (e.g. job postings)
 - Dimension tables: describe attributes or dimensions of the data (e.g. skills, companies); they also support filtering, grouping and labeling of facts in reports

Basic database for SQLite: lukeb.co/sql_jobs_db. <br>
To run a query: press CTRL+ENTER or click the "Run SQL Query" button on the right.<br>
<br>
Order in which the clauses are written:  
SELECT  
FROM  
WHERE  
GROUP BY  
HAVING  
ORDER BY ... ASC|DESC  
LIMIT
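
As a minimal sketch of the clause order above, the snippet below runs one query that uses every clause. It relies on Python's built-in sqlite3 module and an in-memory database. The sample rows are made up for illustration, and AVG, COUNT and IS NOT NULL are used only so that GROUP BY and HAVING have something to act on.

```python
import sqlite3

# Hypothetical sample rows; the real job_postings_fact table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact (job_title_short TEXT, salary_year_avg REAL)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [
        ("Data Analyst", 80000),
        ("Data Analyst", 100000),
        ("Data Engineer", 120000),
        ("Data Engineer", 140000),
        ("Data Scientist", 95000),
        ("Data Scientist", None),  # posting without a salary
    ],
)

# The clauses appear in exactly the order listed in the notes.
query = """
SELECT job_title_short, AVG(salary_year_avg) AS avg_salary
FROM job_postings_fact
WHERE salary_year_avg IS NOT NULL
GROUP BY job_title_short
HAVING COUNT(*) >= 2
ORDER BY avg_salary DESC
LIMIT 5
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Swapping two clauses (for example, putting ORDER BY before WHERE) makes SQLite reject the query with a syntax error.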

# Basic Statements and Commands

## Select all columns

SELECT: identifies the columns (or data) to read from the database  
FROM: identifies the table we are reading from  
*: selects all the columns  




SELECT *  
FROM job_postings_fact  


## Select specific columns: list the columns one per line, separated by commas, for better readability

SELECT  
&emsp;company_id,  
&emsp;name  
FROM  
&emsp;company_dim  

A more explicit way is to prefix each column with the name of its table:

SELECT  
&emsp;company_dim.company_id,  
&emsp;company_dim.name  
FROM  
&emsp;company_dim  


## Limiting the output  

SELECT  
&emsp; company_dim.company_id,  
&emsp; company_dim.name  
FROM  
&emsp; company_dim  
__LIMIT__ 5  

## Reading distinct (unique) values only  

__SELECT DISTINCT__  
&emsp; job_postings_fact.job_title_short  
FROM job_postings_fact  
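
To see what DISTINCT changes, here is a small sketch that runs the same SELECT with and without it. It assumes Python's built-in sqlite3 module and a tiny in-memory table filled with made-up rows.

```python
import sqlite3

# Hypothetical sample rows: "Data Analyst" appears three times.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact (job_id INTEGER, job_title_short TEXT)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [
        (1, "Data Analyst"),
        (2, "Data Engineer"),
        (3, "Data Analyst"),
        (4, "Data Analyst"),
    ],
)

# Without DISTINCT: one output row per table row.
all_titles = conn.execute(
    "SELECT job_title_short FROM job_postings_fact"
).fetchall()

# With DISTINCT: each value is returned only once.
distinct_titles = conn.execute(
    "SELECT DISTINCT job_title_short FROM job_postings_fact"
).fetchall()

print(len(all_titles), sorted(distinct_titles))
```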

## Running several SQL queries: one below the other, separated by semicolons  

SELECT DISTINCT  
&emsp; job_postings_fact.job_title_short  
FROM job_postings_fact;  

SELECT DISTINCT  
&emsp; job_postings_fact.salary_year_avg  
FROM job_postings_fact;

Note: the current editor shows only the last result. A more advanced editor, used later on, can show the other results too.

## Filter out particular data from the output

Note: WHERE comes right after FROM, and the text value to match is written between single quotes ('') and not double quotes ("").

  - e.g.: I am interested only in postings where the job title is 'Data Analyst'

SELECT  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_location,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
    
FROM  
    &emsp;job_postings_fact  
__WHERE__  
    &emsp;job_title_short = 'Data Analyst'  
LIMIT 10  

  -  e.g.: I am interested in job posts of Data Analysts where the salary is above 90000 per year

SELECT  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_location,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
    
FROM  
    &emsp;job_postings_fact  
__WHERE__  
    &emsp;job_title_short = 'Data Analyst' __AND__ salary_year_avg > 90000  
LIMIT 10  
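
The filter above can be tried on a handful of rows. This is only a sketch: it assumes Python's built-in sqlite3 module and an in-memory table with made-up postings, reduced to the two columns the filter needs.

```python
import sqlite3

# Hypothetical sample rows, not taken from the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact (job_title_short TEXT, salary_year_avg REAL)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [
        ("Data Analyst", 95000),    # matches both conditions
        ("Data Analyst", 85000),    # title matches, salary too low
        ("Data Engineer", 120000),  # salary matches, title does not
        ("Data Analyst", None),     # no salary, so the comparison is not true
    ],
)

# The text value is written between single quotes.
query = """
SELECT job_title_short, salary_year_avg
FROM job_postings_fact
WHERE job_title_short = 'Data Analyst' AND salary_year_avg > 90000
LIMIT 10
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Only the first row passes, because AND requires both conditions to be true for the same row.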

## Adding comments

### Single-line comment: starts with '--' and runs to the end of the line; it can be put anywhere in the query

-- query to see relevant data for DAs  
SELECT  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_location,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg    
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_title_short = 'Data Analyst' AND salary_year_avg > 90000  
LIMIT 10

### Multi-line comment: e.g. a note at the beginning of the query explaining why I run that specific query
__/*__ multi  
line  
comment  
__*/__

## Ordering (sorting)

### Sort ascending, from smallest to largest. Note: rows with no value (NULL) count as the smallest, so they come first

/*  
Line 1 for multi-comment  
something  
Line 3  
*/  
SELECT  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_location,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg    
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_title_short = 'Data Analyst' AND salary_year_avg > 90000  
__ORDER BY__  
    &emsp;salary_year_avg  -- you can put here __ASC__, right after the sorting key, but it is not necessary, the default ORDER is ASC  
LIMIT 10  

### Sort Descending

/*  
Line 1 for multi-comment  
something  
Line 3  
*/  
SELECT  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_location,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg    
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_title_short = 'Data Analyst' AND salary_year_avg > 90000  
__ORDER BY__  
    &emsp;salary_year_avg __DESC__  
LIMIT 10  
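
Both sort orders, and where NULL ends up in each, can be checked with a short sketch. It assumes Python's built-in sqlite3 module and three made-up rows; the helper `salaries` is a hypothetical name used only here. The NULL placement shown is SQLite's behavior, and other database engines may place NULL differently.

```python
import sqlite3

# Hypothetical sample rows; job 2 has no salary (NULL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact (job_id INTEGER, salary_year_avg REAL)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [(1, 95000), (2, None), (3, 120000)],
)

def salaries(direction):
    """Return the salary column sorted with the given direction keyword."""
    sql = (
        "SELECT salary_year_avg FROM job_postings_fact "
        f"ORDER BY salary_year_avg {direction}"
    )
    return [row[0] for row in conn.execute(sql)]

default_order = salaries("")    # no keyword: same as ASC
ascending = salaries("ASC")     # NULL comes first
descending = salaries("DESC")   # NULL comes last

print(default_order, ascending, descending)
```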

# Comparisons

## Notes:

Comparisons are used in the WHERE and HAVING clauses  
They are written with comparison operators: $=,<>,>,<,<=,>=$  
They are combined with logical operators: __AND,OR,NOT,BETWEEN,IN__

## Comparison operators, e.g. not equal: select the rows where job_via is not 'via Ai-Jobs.net'

### First solution:  
SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM   
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_postings_fact.job_via <> 'via Ai-Jobs.net'


### Second solution:  
SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM   
    &emsp;job_postings_fact  
WHERE __NOT__  
    &emsp;job_postings_fact.job_via = 'via Ai-Jobs.net'

### Note that double negation works: select the rows where job_via is 'via Ai-Jobs.net'

SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM   
    &emsp;job_postings_fact  
WHERE NOT  
    &emsp;job_postings_fact.job_via <> 'via Ai-Jobs.net'
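
The three WHERE variants above can be compared side by side. This sketch assumes Python's built-in sqlite3 module and made-up rows; the helper `ids` is a hypothetical name. One row has no job_via value (NULL) to show that none of the three conditions selects it, since a comparison with NULL is never true.

```python
import sqlite3

# Hypothetical sample rows; job 3 has no job_via value (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_postings_fact (job_id INTEGER, job_via TEXT)")
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [(1, "via Ai-Jobs.net"), (2, "via LinkedIn"), (3, None)],
)

def ids(condition):
    """Return the job_id values of the rows that satisfy the condition."""
    sql = f"SELECT job_id FROM job_postings_fact WHERE {condition} ORDER BY job_id"
    return [row[0] for row in conn.execute(sql)]

not_equal = ids("job_via <> 'via Ai-Jobs.net'")            # first solution
not_with_equal = ids("NOT job_via = 'via Ai-Jobs.net'")    # second solution
double_negation = ids("NOT job_via <> 'via Ai-Jobs.net'")  # same as '='

print(not_equal, not_with_equal, double_negation)
```

To list the rows without a value as well, the condition would need an explicit `job_via IS NULL` check, which is outside what the notes cover so far.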

## Logical operators

### AND: Salary is more than 100000 per year for Data Analysts:  
SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM   
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_postings_fact.salary_year_avg > 100000  
    &emsp;__AND__ job_postings_fact.job_title_short = 'Data Analyst'  
ORDER BY  
    &emsp;job_postings_fact.salary_year_avg

### OR: this is the inclusive OR (rows where both conditions are true are returned, too).  
List the jobs that are for a Data Analyst OR have salary_year_avg > 100000 (so a Data Analyst posting with a salary of 120000 per year is listed as well)

SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_postings_fact.job_title_short = 'Data Analyst'  
    &emsp;OR job_postings_fact.salary_year_avg > 100000  
ORDER BY  
    &emsp;job_postings_fact.salary_year_avg
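
A small sketch of the inclusive OR, assuming Python's built-in sqlite3 module and four made-up rows that cover every combination of the two conditions:

```python
import sqlite3

# Hypothetical sample rows covering all four true/false combinations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact "
    "(job_id INTEGER, job_title_short TEXT, salary_year_avg REAL)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?, ?)",
    [
        (1, "Data Analyst", 120000),   # both conditions true
        (2, "Data Analyst", 80000),    # only the title matches
        (3, "Data Engineer", 130000),  # only the salary matches
        (4, "Data Engineer", 70000),   # neither matches
    ],
)

query = """
SELECT job_id
FROM job_postings_fact
WHERE job_title_short = 'Data Analyst'
   OR salary_year_avg > 100000
ORDER BY job_id
"""
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)
```

Only job 4 is left out, because it fails both conditions.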

### BETWEEN: matches a range of values, both ends included

SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_postings_fact.job_title_short = 'Data Analyst'  
    &emsp;OR job_postings_fact.salary_year_avg BETWEEN 60000 AND 90000  
ORDER BY  
    &emsp;job_postings_fact.salary_year_avg
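
To check that BETWEEN includes both ends of the range, the sketch below compares it with the equivalent pair of >= and <= comparisons. It assumes Python's built-in sqlite3 module and made-up salaries placed right at and just outside the limits; the helper `salaries` is a hypothetical name.

```python
import sqlite3

# Hypothetical salaries around the two limits 60000 and 90000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_postings_fact (salary_year_avg REAL)")
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?)",
    [(59999,), (60000,), (75000,), (90000,), (90001,)],
)

def salaries(condition):
    """Return the salaries that satisfy the condition, smallest first."""
    sql = (
        "SELECT salary_year_avg FROM job_postings_fact "
        f"WHERE {condition} ORDER BY salary_year_avg"
    )
    return [row[0] for row in conn.execute(sql)]

between = salaries("salary_year_avg BETWEEN 60000 AND 90000")
explicit = salaries("salary_year_avg >= 60000 AND salary_year_avg <= 90000")

print(between, explicit)
```

The AND inside BETWEEN ... AND ... belongs to BETWEEN itself, so in the query above the OR still combines only two conditions: the job title check and the salary range.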

### IN: match against any value in a list instead of a single value: IN (value1, value2, ..., valueN)  
Job titles of Data Analyst or Data Engineer or Data Scientist:  
SELECT  
    &emsp;job_postings_fact.job_id,  
    &emsp;job_postings_fact.job_title_short,  
    &emsp;job_postings_fact.job_via,  
    &emsp;job_postings_fact.salary_year_avg  
FROM  
    &emsp;job_postings_fact  
WHERE  
    &emsp;job_postings_fact.job_title_short __IN__ ('Data Analyst', 'Data Engineer', 'Data Scientist')  
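
IN can be read as a shorter way to write a chain of OR comparisons on the same column. The sketch below checks that both forms return the same rows. It assumes Python's built-in sqlite3 module and made-up rows; the helper `ids` is a hypothetical name.

```python
import sqlite3

# Hypothetical sample rows; job 4 has a title that is not in the list.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_postings_fact (job_id INTEGER, job_title_short TEXT)"
)
conn.executemany(
    "INSERT INTO job_postings_fact VALUES (?, ?)",
    [
        (1, "Data Analyst"),
        (2, "Data Engineer"),
        (3, "Data Scientist"),
        (4, "Software Engineer"),
    ],
)

def ids(condition):
    """Return the job_id values of the rows that satisfy the condition."""
    sql = f"SELECT job_id FROM job_postings_fact WHERE {condition} ORDER BY job_id"
    return [row[0] for row in conn.execute(sql)]

with_in = ids(
    "job_title_short IN ('Data Analyst', 'Data Engineer', 'Data Scientist')"
)
with_or = ids(
    "job_title_short = 'Data Analyst' "
    "OR job_title_short = 'Data Engineer' "
    "OR job_title_short = 'Data Scientist'"
)

print(with_in, with_or)
```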