In [None]:
# exercise 01

"""
Create a role

A database role is an entity that contains information that defines the role's privileges and interacts with the client authentication system. Roles allow you to give different people (and often groups of people) who interact with your data different levels of access.

Imagine you founded a startup. You are about to hire a group of data scientists. You also hired someone named Marta who needs to be able to log in to your database. You're also about to hire a database administrator. In this exercise, you will create these roles.
"""

# Instructions

"""
Create a role called data_scientist.
---
Create a role called marta that has one attribute: the ability to login (LOGIN).
---
Create a role called admin with the ability to create databases (CREATEDB) and to create roles (CREATEROLE).
"""

# solution

-- Create a data scientist role
CREATE ROLE data_scientist;

#----------------------------------#

-- Create a role for Marta
CREATE ROLE marta LOGIN;

#----------------------------------#

-- Create an admin role
CREATE ROLE admin WITH CREATEDB CREATEROLE;

#----------------------------------#

# Conclusion

"""
Nice work! You created a group role, data_scientist, that you can populate later with whatever access level you deem appropriate. Marta can log in. The admin, whoever holds that role, has the ability to create databases and manage roles. You now know how to create roles to specify different levels of access for individuals and groups of individuals, which is good database management practice.
"""

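To confirm that the three roles ended up with the intended attributes, you can query the `pg_roles` system view. This is a minimal sketch that assumes a PostgreSQL server on which the `CREATE ROLE` statements from exercise 01 have already run.

```sql
-- Inspect the attributes of the three roles created in exercise 01.
SELECT rolname, rolcanlogin, rolcreatedb, rolcreaterole
FROM pg_roles
WHERE rolname IN ('data_scientist', 'marta', 'admin')
ORDER BY rolname;
```

Because `CREATE ROLE` defaults to `NOLOGIN`, `NOCREATEDB`, and `NOCREATEROLE`, only `marta` should show `rolcanlogin` as true, and only `admin` should show `rolcreatedb` and `rolcreaterole` as true.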

In [1]:
# exercise 02

"""
GRANT privileges and ALTER attributes

Once roles are created, you grant them specific access control privileges on objects, like tables and views. Common privileges include SELECT, INSERT, and UPDATE.

Imagine you're a cofounder of that startup and you want all of your data scientists to be able to update and insert data in the long_reviews view. In this exercise, you will enable those soon-to-be-hired data scientists by granting their role (data_scientist) those privileges. Also, you'll give Marta's role a password.
"""

# Instructions

"""

    Grant the data_scientist role update and insert privileges on the long_reviews view.

    Alter Marta's role to give her the provided password.

"""

# solution

-- Grant data_scientist update and insert privileges
GRANT UPDATE, INSERT ON long_reviews TO data_scientist;

-- Give Marta's role a password
ALTER ROLE marta WITH PASSWORD 's3cur3p@ssw0rd';

#----------------------------------#

# Conclusion

"""
Cool! Everyone in the data_scientist role (which is currently no one, though you're hiring shortly) is now able to update data and insert data in the long_reviews view. This view has business-critical data that's updated often so these privileges are key to your startup's success. Marta is happy because she has a password now, too!
"""

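One way to check the result of the `GRANT` in exercise 02 is PostgreSQL's `has_table_privilege` function, which works for views as well as tables. The sketch below assumes the `long_reviews` view and the `data_scientist` role from that exercise exist; the `DELETE` check is added only for contrast and is not part of the exercise.

```sql
-- Ask PostgreSQL which privileges data_scientist holds on long_reviews.
SELECT has_table_privilege('data_scientist', 'long_reviews', 'UPDATE') AS can_update,
       has_table_privilege('data_scientist', 'long_reviews', 'INSERT') AS can_insert,
       has_table_privilege('data_scientist', 'long_reviews', 'DELETE') AS can_delete;
```

After the `GRANT`, the first two columns should be true. `can_delete` should stay false unless the role obtained that privilege some other way.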

In [2]:
# exercise 03

"""
Add a user role to a group role

There are two types of roles: user roles and group roles. By assigning a user role to a group role, a database administrator can add complicated levels of access to their databases with one simple command.

For your startup, your search for data scientist hires is taking longer than expected. Fortunately, it turns out that Marta, your recent hire, has previous data science experience and she's willing to chip in during the interim. In this exercise, you'll add Marta's user role to the data scientist group role. You'll then remove her after you complete your hiring process.
"""

# Instructions

"""

    
    Add Marta's user role to the data scientist group role.
    
    Celebrate! You hired multiple data scientists.
    
    Remove Marta's user role from the data scientist group role.

"""

# solution

-- Add Marta to the data scientist group
GRANT data_scientist TO marta;

-- Celebrate! You hired data scientists.

-- Remove Marta from the data scientist group
REVOKE data_scientist FROM marta;

#----------------------------------#

# Conclusion

"""
Bless you, Marta! She really helped the company out in a pinch. And it wasn't difficult for you to set her up with appropriate access to company data thanks to the roles you previously created!
"""

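To see whether Marta is currently a member of the group role from exercise 03, you can use the `pg_has_role` function or list the members recorded in `pg_auth_members`. This is a sketch that assumes the `marta` and `data_scientist` roles from the earlier exercises exist.

```sql
-- True while marta is a member of data_scientist, false otherwise.
SELECT pg_has_role('marta', 'data_scientist', 'MEMBER') AS marta_is_member;

-- List every role that is a direct member of data_scientist.
SELECT g.rolname AS group_role,
       m.rolname AS member_role
FROM pg_auth_members AS am
JOIN pg_roles AS g ON g.oid = am.roleid
JOIN pg_roles AS m ON m.oid = am.member
WHERE g.rolname = 'data_scientist';
```

Run between the `GRANT` and the `REVOKE`, the first query should return true and the second should list `marta`. After the `REVOKE`, the first should return false and the second should no longer list her.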

# Reasons to partition

In the video, you saw some very good reasons to use partitioning. However, can you find which one wouldn't be a good reason to use partitioning?

### Possible Answers


    Improve data integrity {Answer}
    
    
    Save records from 2017 or earlier on a slower medium
    
    
    Easily extend partitioning to sharding, and thus make use of parallelization

**Exactly! That's not something you'd use partitioning for.**

# Partitioning and normalization

In the video, you saw the differences between the two types of partitioning: vertical and horizontal partitioning. As you'd expect, the names suggest how these different strategies work.

It might be a bit challenging to distinguish normalization, which you saw in previous chapters, from partitioning.

![PARTITIONING](/home/nero/Documents/Estudos/DataCamp/SQL/courses/database-design/partitioning.png)

**That's right! Partitioning is related to the physical data model. It does not change the logical data model, while normalization does.**

In [3]:
# exercise 04

"""
Creating vertical partitions

In the video, you learned about vertical partitioning and saw an example.

For vertical partitioning, there is no specific syntax in PostgreSQL. You have to create a new table with particular columns and copy the data there. Afterward, you can drop from the original table the columns you moved into the separate partition. If you need to access the full table, you can do so by using a JOIN clause.

In this exercise and the next one, you'll be working with the example database called pagila. It's a database that is often used to showcase PostgreSQL features. The database contains several tables. We'll be working with the film table. In this exercise, we'll use the following columns:

    film_id: the unique identifier of the film
    long_description: a lengthy description of the film

"""

# Instructions

"""

    Create a new table film_descriptions containing 2 fields: film_id, which is of type INT, and long_description, which is of type TEXT.
   
    Populate the new table with values from the film table.
---

    Drop the field long_description from the film table.
    Join the two resulting tables to view the original table.

"""

# solution

-- Create a new table called film_descriptions
CREATE TABLE film_descriptions (
    film_id INT,
    long_description TEXT
);

-- Copy the descriptions from the film table
INSERT INTO film_descriptions
SELECT film_id, long_description FROM film;
    
-- Drop the descriptions from the original table
ALTER TABLE film DROP COLUMN long_description;

-- Join to view the original table
SELECT * FROM film
JOIN film_descriptions USING(film_id);

#----------------------------------#

# Conclusion

"""
That's it! Now you know how to use CREATE, INSERT, and ALTER statements to build a vertical partition!
"""

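If the full table is needed often after the vertical partitioning in exercise 04, the join can be wrapped in a view so that queries do not have to repeat it. This is only a sketch: `film_full` is a hypothetical name, and it assumes `film` and `film_descriptions` exist as produced by that exercise's solution.

```sql
-- Recreate the original shape of film as a view over the two partitions.
-- USING (film_id) merges the join column, so film_id appears only once.
CREATE VIEW film_full AS
SELECT *
FROM film
JOIN film_descriptions USING (film_id);

-- Query the view like the original table.
SELECT film_id, long_description
FROM film_full;
```

The view stores no data of its own, so the lengthy descriptions still live only in `film_descriptions`.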

In [4]:
# exercise 05

"""
Creating horizontal partitions

In the video, you also learned about horizontal partitioning.

The example of horizontal partitioning showed the syntax necessary to create horizontal partitions in PostgreSQL. If you need a reminder, you can have a look at the slides.

In this exercise, however, you'll be using a list partition instead of a range partition. For list partitions, you form partitions by checking whether the partition key is in a list of values or not.

To do this, we partition by LIST instead of RANGE. When creating the partitions, you should check if the values are IN a list of values.

We'll be using the following columns in this exercise:

    film_id: the unique identifier of the film
    title: the title of the film
    release_year: the year it's released

"""

# Instructions

"""

    Create the table film_partitioned, partitioned on the field release_year.
---

    Create three partitions, one for each release year: 2017, 2018, and 2019. Call the partition for 2019 film_2019, etc.
---

    Populate the new table, film_partitioned, with the three fields required from the film table.

"""

# solution

-- Create a new table called film_partitioned
CREATE TABLE film_partitioned (
  film_id INT,
  title TEXT NOT NULL,
  release_year TEXT
)
PARTITION BY LIST (release_year);

-- Create the partitions for 2019, 2018, and 2017
CREATE TABLE film_2019
	PARTITION OF film_partitioned FOR VALUES IN ('2019');

CREATE TABLE film_2018
	PARTITION OF film_partitioned FOR VALUES IN ('2018');

CREATE TABLE film_2017
	PARTITION OF film_partitioned FOR VALUES IN ('2017');

-- Insert the data into film_partitioned
INSERT INTO film_partitioned
SELECT film_id, title, release_year FROM film;

-- View film_partitioned
SELECT * FROM film_partitioned;

#----------------------------------#

# Conclusion

"""
Great! As you can see, the data is not changed in the partitioned table. However, you might notice PostgreSQL orders the partitioned table differently by default.
"""

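To see which partition each row from exercise 05 was routed to, you can group by the `tableoid` system column, which identifies the physical table a row is stored in. The sketch below assumes `film_partitioned` and its three partitions exist as created in that exercise. The catch-all partition `film_other` is a hypothetical addition, not part of the exercise, and the `DEFAULT` clause assumes PostgreSQL 11 or later.

```sql
-- Count the rows stored in each partition of film_partitioned.
SELECT tableoid::regclass AS partition_name,
       count(*)           AS films
FROM film_partitioned
GROUP BY tableoid
ORDER BY tableoid::regclass::text;

-- Optional (hypothetical): a catch-all partition for any other release year.
-- Without it, inserting a year outside 2017-2019 raises an error.
CREATE TABLE film_other
    PARTITION OF film_partitioned DEFAULT;
```

The sum of the per-partition counts should equal the number of rows returned by `SELECT * FROM film_partitioned`.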

# Data integration do's and don'ts

You just learned a lot about data integration. Let's check your understanding of the concepts.

![DATA_INTEGRATION](/home/nero/Documents/Estudos/DataCamp/SQL/courses/database-design/data_integration.png)

**Great work. Looks like you have a good grasp of what to take into account when evaluating a data integration solution, and of what you can expect from a good one.**

# Analyzing a data integration plan

You're a data analyst in a hospital that wants to make sure there is enough cough medicine should an epidemic break out. For this, you need to combine the historical health records with the upcoming appointments to see if you can detect a pattern similar to the last cold epidemic. Then, you need to determine whether there is sufficient stock available or whether the stock should be increased. To help tackle this problem, you created a data integration plan.

Which risk is not clearly indicated on the data integration plan?

![DATA_PLAN](/home/nero/Documents/Estudos/DataCamp/SQL/courses/database-design/DIex-Hospital_example.jpeg)

### Possible Answers


    It is unclear if you took data governance into account.
    
    
    You didn't clearly show where your data originated from.
    
    
    You should indicate that you plan to anonymize patient health records.{Answer}
    
    
    If data is lost during ETL you will not find out.

**Correct! When working with sensitive data, it is important to think about permissions. By default, you should have the same access rights before and after data integration. If part of the data is essential, it should be anonymized; in this case, you can keep the illnesses but remove identifying information.**
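As a simplified illustration of keeping the illnesses while removing identifying information, the integrated data could be exposed through a view that leaves out the identifying columns. Every name here (`health_records`, `diagnosis`, `visit_date`, `health_records_anonymized`, `data_analyst`) is hypothetical, and this is a sketch of the idea rather than a complete anonymization scheme.

```sql
-- Expose only non-identifying columns of a hypothetical health_records table.
-- Patient name, ID, and address are simply not selected, and the exact
-- visit date is coarsened to the start of its week.
CREATE VIEW health_records_anonymized AS
SELECT diagnosis,
       date_trunc('week', visit_date) AS visit_week
FROM health_records;

-- Give a hypothetical analyst role read access to the view only,
-- not to the underlying table.
GRANT SELECT ON health_records_anonymized TO data_analyst;
```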

# SQL versus NoSQL

Deciding when to use a SQL versus NoSQL DBMS depends on the kind of information you’re storing and the best way to store it. Both types store data; they just store it differently.

When is it better to use a SQL DBMS?

### Possible Answers


    You are dealing with rapidly evolving features, functions, data types, and it’s difficult to predict how the application will grow over time.
    
    
    You have a lot of data, many different data types, and your data needs will only grow over time.
    
    
    You are concerned about data consistency and 100% data integrity is your top goal.{Answer}
    
    
    Your data needs scale up, out, and down.

**Perfect! The strength of SQL DBMSs lies in using integrity constraints to maintain data consistency across multiple tables.**
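A foreign key is one example of an integrity constraint that keeps data consistent across tables. The sketch below uses hypothetical `authors` and `books` tables, which do not appear in the course material, to show the database itself rejecting an inconsistent row.

```sql
-- Hypothetical tables: every book must reference an existing author.
CREATE TABLE authors (
    author_id INT PRIMARY KEY,
    name      TEXT NOT NULL
);

CREATE TABLE books (
    book_id   INT PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INT NOT NULL REFERENCES authors (author_id)
);

INSERT INTO authors VALUES (1, 'Ada');

-- Succeeds: author 1 exists.
INSERT INTO books VALUES (10, 'Notes', 1);

-- Fails with a foreign key violation: there is no author 99.
INSERT INTO books VALUES (11, 'Orphan', 99);
```

The last statement is rejected by the DBMS rather than by application code, which is the kind of guarantee the answer above refers to.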

# Choosing the right DBMS

As you saw in the video, there are lots of different options when choosing a DBMS. The choice depends on the business need. In this exercise, you are given a list of cards describing different scenarios and it's your job to pick the DBMS type that fits the project best. Remember the different DBMS types:

    SQL: RDBMS
    NoSQL: key-value store, document store, columnar database, graph database
![DBMS](/home/nero/Documents/Estudos/DataCamp/SQL/courses/database-design/DMBS.png)

**Great work! As you can see there are many different DBMS types and you need to carefully consider the business needs before making your decision.**