# "SQL is fun!"
> "A dive into a friendly world of SQL."

- toc: true 
- badges: false
- comments: false
- categories: [sql] 
- image: images/chart-preview.png

## SQL in action

```TODO: use case demo + some intro```

## Building your first end-to-end project

Now that you've seen an example of a data project using SQL, you'll see how to build such a project yourself :rocket:

### Download the data

You can't do analytics without data. How you actually get it depends on your project. When working on a business project, you may find the crucial datasets in your company's database. That's quite convenient, as you don't have to prepare and upload the data yourself.

Eventually you may run into the limitations of the data at your disposal. Or you may want to work on a personal project where you need to create your dataset from scratch. In such cases, the ability to explore and use publicly available datasets can be a very useful skill. There's no golden rule, but when looking for the right data you might want to check sites like [Kaggle](https://www.kaggle.com/datasets), [this repo](https://github.com/awesomedata/awesome-public-datasets), [Dataset Search by Google](https://datasetsearch.research.google.com/) or [this subreddit](https://www.reddit.com/r/datasets/).

In our example, we'll use the data on COVID-19 vaccinations shared by Our World in Data. You can find it in [this repo](https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations). To download the data, follow these steps:
- visit [this link](https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv),
- click on the little `Raw` button in the upper right corner,
- when you see the comma-delimited text, right-click on it and select the `Save as..` option to save the file on your computer.
```TODO: screenshots for the bullets above?```
```TODO: you might be wondering if there is a better way to download this file, clone repo? curl?```
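
If you'd rather script the download than click through the browser, here's a minimal sketch using only Python's standard library. The `RAW_URL` value is an assumption: it's the address the `Raw` button is expected to lead to, so check it in your browser first. The helper name `download_file` is made up for this example.

```python
import urllib.request
from pathlib import Path

# Assumed raw-file address, derived from the GitHub link above. Verify before use.
RAW_URL = (
    "https://raw.githubusercontent.com/owid/covid-19-data/"
    "master/public/data/vaccinations/vaccinations.csv"
)


def download_file(url: str, destination: str) -> Path:
    """Fetch `url` and save its contents under `destination`; return the saved path."""
    target = Path(destination)
    with urllib.request.urlopen(url) as response:
        target.write_bytes(response.read())
    return target
```

With network access, calling `download_file(RAW_URL, "vaccinations.csv")` should leave the file in your current working directory.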

Congratulations, you've just completed the first step to get your data project up and running!

### Choose the database

Do you remember "The Matrix", the cult movie from 1999? There's a very memorable scene in which Neo is given a choice between two pills, a red one and a blue one. He had a once-in-a-lifetime opportunity to decide whether he wanted to go on living a lie or find out the truth.

There are two remarkable aspects of this decision. First, one of the options feels objectively more "correct". No one wants their life to be a lie. Second, after making this decision, there was no way back, which made it an extremely difficult choice.

At this point in the book, you also need to make a decision. There are many different database solutions and many different SQL flavours. This book covers a couple of them, so you need to pick the one that you'll move on with. Compared to the dilemma that Neo faced, the decision that you are going to make is both easier and more difficult at the same time.

It's more difficult because there is no answer that is more "correct" than the others. All solutions discussed in this book are widely used. All of them have advantages and disadvantages, and there is no single "right choice". Fortunately, unlike Neo, you don't need to live with this decision for the rest of your life. Quite the opposite! You can start with one tool and later on come back to try another one.

It gets even better than that. There is a standard that describes the principles of the SQL language. It means that even if you learn one implementation of SQL (e.g. SQLite) and want to switch to another one (e.g. PostgreSQL), you should be able to transfer most of your knowledge. To be clear, it doesn't mean that each vendor implements the language in an identical way. They try to follow the standard to some degree, but there are some differences. Those discrepancies, however, are small enough that your knowledge stays transferable most of the time.

Let's now move on to installing the database. This book uses SQLite as the default option to walk you through the process. If you want to see instructions for other databases, visit one of these links:
- BigQuery (todo: in the backlog): [link](https://creaitive.studio/sql/2021/03/18/01_intro_big_query.html),
- SQLite via Python (todo: in the backlog): link,
- PostgreSQL (todo: in the backlog): link.

```TODO: matrix/table showing pros/cons of starting with each of the solutions```
```TODO: maybe more info in general about different dbs```

Let's get started! :muscle:

### Install the database

In this section you'll see how to install the SQLite database. One of the reasons that SQLite is a particularly good starting point for learning is its simplicity. The entire database is contained in just one file. Setting it up is trivial. It requires hardly any installation and no configuration at all. It's free, so you can use it to its limits without worrying about the costs.
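
To see the "one file" idea in practice, here's a minimal sketch using Python's built-in `sqlite3` module. Python isn't required to use SQLite; it's just a convenient way to demonstrate the point. The file name `first_project.db` and the `greetings` table are made up for this example.

```python
import sqlite3
import tempfile
from pathlib import Path


def create_database(db_path: Path) -> None:
    """Create a SQLite database file containing one tiny table."""
    connection = sqlite3.connect(db_path)  # opens the database, creating the file if needed
    try:
        connection.execute("CREATE TABLE greetings (message TEXT)")
        connection.execute("INSERT INTO greetings VALUES ('SQL is fun!')")
        connection.commit()
    finally:
        connection.close()


# Hypothetical file name, placed in a temporary directory for the demo.
db_file = Path(tempfile.mkdtemp()) / "first_project.db"
create_database(db_file)

# The table and its data live in this single file on disk.
print(db_file, db_file.exists())
```

No server was started and nothing was configured: copying that one file to another machine is enough to move the whole database.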

But don't let the simplicity fool you. It is a fully reliable database with the features you would expect from a modern SQL database. Its characteristics make it a perfect choice for:
- smartphones and other devices,
- websites,
- data analytics,
- and many more.

The developers of SQLite claim that it's likely the [most widely deployed database in the entire world](https://www.sqlite.org/mostdeployed.html). They also intend to support it through the year 2050.

Truth be told, SQLite is a bit different from its more traditional siblings: fully fledged relational database management systems (RDBMS) like PostgreSQL, SQL Server or Oracle. It's not a good choice for every use case. You may decide to choose an alternative if your project requires:
- accessing the database directly from many computers at the same time,
- simultaneous modification of the data by many users,
- working with very large amounts of data (although it can officially handle up to 281 terabytes, having to keep everything in one file is the bottleneck here).

It also has some unusual behaviours that differ from what you will find anywhere else. Every time we come across such behaviour, you will see a warning. You can also get familiar with the full list in advance using [the official documentation](https://www.sqlite.org/quirks.html). You can see an example of the warning below.

> Important: SQLite is not a replacement for more traditional solutions. Unlike most databases it doesn't come with a dedicated server process and it's self-contained in a file. It has its own, specific problem that it's trying to solve.


There's no perfect solution, but the features of SQLite make it an amazing candidate for learning. With SQLite you'll get a great ratio of "using SQL in practice" to "trying to set up the database". And remember, you're not like Neo. You can always start with one option and switch to something else in the future :shipit:

### Interact with your database

```TODO: now that we can interact with ..., let's discuss what are the main benefits and ...```

### Import the data
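
One way to load the downloaded CSV file into SQLite is through Python's built-in `csv` and `sqlite3` modules. The sketch below is only an illustration under a few assumptions: the helper name `import_csv` is made up, the first row of the file is treated as the column names, and every value is stored as text exactly as it appears in the file (empty cells become empty strings, not `NULL`).

```python
import csv
import sqlite3


def import_csv(connection, csv_path, table_name):
    """Load a CSV file into a new table and return the number of imported rows.

    The header row supplies the column names. Table and column names are
    pasted into the SQL text, so only use this with files you trust.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        connection.execute(f'CREATE TABLE "{table_name}" ({columns})')
        # Values go in through placeholders, so they are never parsed as SQL.
        connection.executemany(
            f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader
        )
    connection.commit()
    cursor = connection.execute(f'SELECT COUNT(*) FROM "{table_name}"')
    return cursor.fetchone()[0]
```

For instance, `import_csv(sqlite3.connect("first_project.db"), "vaccinations.csv", "vaccinations")` would create a `vaccinations` table in a database file called `first_project.db`; both names are placeholders you can change.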

### Run your first SQL query

Now we'll go through the *SQL query* that was used to prepare the data for our report. A *query* is a piece of SQL code that you use to get data from the database. You can also use it to do all kinds of manipulations, such as aggregations, custom calculations, filtering, sorting and so on.

One of the reasons that I'm such a huge fan of SQL is its friendliness. SQL is a declarative language. It means that you don't need to specify all the low-level operations to get the desired result. You just need to describe the final outcome and the database will figure out the rest for you.
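
To make the difference concrete, here's a small sketch that computes the same totals twice: once step by step in Python, and once by describing the result in SQL. The `doses_given` table and its rows are invented for illustration, and an in-memory SQLite database stands in for a real one.

```python
import sqlite3

# Made-up sample rows, for illustration only.
rows = [("Poland", 120), ("Poland", 80), ("France", 200)]

# Imperative style: spell out every step of the calculation yourself.
totals = {}
for country, doses in rows:
    totals[country] = totals.get(country, 0) + doses

# Declarative style: describe the result and let the database work out the steps.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE doses_given (country TEXT, doses INTEGER)")
connection.executemany("INSERT INTO doses_given VALUES (?, ?)", rows)
query = "SELECT country, SUM(doses) FROM doses_given GROUP BY country"
sql_totals = dict(connection.execute(query).fetchall())
connection.close()

# Both approaches arrive at the same totals per country.
print(totals == sql_totals)
```

The loop says *how* to add things up; the query only says *what* you want, namely one total per country.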

```TODO: diagram to illustrate the point above```

If you have some basic knowledge of English, you may be surprised to find that you can nearly understand a simple SQL query just by reading it.

```TODO: simple SQL```

It almost feels like pseudocode. The cool thing is that it's an actual working query.
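
To give a flavour of how such a query reads, here's a hypothetical one (not the query behind the report), run through Python's built-in `sqlite3` module against a made-up `books` table. The comments after `--` explain each clause in plain English.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Hypothetical table and rows, invented for this example.
connection.execute("CREATE TABLE books (title TEXT, year INTEGER)")
connection.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Old Tales", 1950), ("New Tales", 2015), ("Newer Tales", 2020)],
)

query = """
    SELECT title, year      -- which columns to show
    FROM books              -- which table to read
    WHERE year > 2000       -- which rows to keep
    ORDER BY year DESC      -- how to sort the result
"""
result = connection.execute(query).fetchall()
connection.close()

for title, year in result:
    print(title, year)
```

Read aloud, the query is close to an English sentence: select the title and year from books where the year is after 2000, ordered by year from newest to oldest.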

```TODO: explain the meaning of select, from etc.```

Let's see what happens when you run this code.

```TODO: run code and show results from db```
You can now try to execute this query inside the database management app of your choice.

### Visualize your insights