I take a dataset of movie information stored in a CSV file (movies.csv), clean it, and turn it into a normalized set of tables.
Initial movie table
Normalized set of tables
- [X] Remove special characters from the **movies** and **year** columns
- [X] Replace empty values with NULL
- [X] Trim spaces and remove newlines from columns
- [X] Remove multivalued attributes
- [X] Remove duplicate values
- [X] Find Functional Dependencies
- [X] Decompose Tables
- [X] Set surrogate keys
- [X] Check for lossless joins
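The cleaning steps in the checklist above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the staging table `movies_raw`, the `genre` column, and the sample rows are assumptions made for the example; only the column names `movies` and `year` come from the checklist.

```sql
-- Hypothetical staging table; only the column names movies and year
-- come from the checklist. The genre column and all rows are invented.
CREATE TEMP TABLE movies_raw (
    movies text,
    year   text,
    genre  text   -- assumed multivalued column, e.g. 'Action, Drama'
);

INSERT INTO movies_raw (movies, year, genre) VALUES
    (E'  The #Example Movie\n', '(2019)', 'Action, Drama'),
    ('Another Movie',           '',       E'\nComedy  '),
    (E'  The #Example Movie\n', '(2019)', 'Action, Drama');  -- duplicate row

-- Strip special characters, trim spaces and newlines,
-- turn empty strings into NULL, and drop duplicate rows.
CREATE TEMP TABLE movies_clean AS
SELECT DISTINCT
    nullif(trim(regexp_replace(movies, '[^[:alnum:] ]', '', 'g')), '') AS movies,
    cast(nullif(regexp_replace(year, '[^0-9]', '', 'g'), '') AS integer) AS year,
    nullif(trim(replace(genre, E'\n', '')), '') AS genre
FROM movies_raw;

-- Remove multivalued attributes: one row per (movie, genre)
-- instead of a comma-separated list.
CREATE TEMP TABLE movie_genre AS
SELECT movies, year, trim(g) AS genre
FROM movies_clean
CROSS JOIN LATERAL unnest(string_to_array(genre, ',')) AS g;

SELECT * FROM movie_genre ORDER BY movies, genre;
```

Keeping the raw table untouched and writing each step into a new table makes it easy to compare the data before and after each transformation.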
- aggregate functions
- window functions
- views
- joins
- unions
- unnest()
- replace()
- substring()
- trim()
- nullif()
- regexp_replace()
- left()
- right()
- string_to_array()
- cast()
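Several of the features listed above can be combined to decompose a table, assign surrogate keys, and check that the decomposition is lossless. The sketch below assumes a hypothetical cleaned table `movie_flat` with invented columns and rows; it is one possible approach, not necessarily the one used in this repository.

```sql
-- Hypothetical cleaned table; names and rows are illustrative only.
CREATE TEMP TABLE movie_flat (
    title    text,
    year     integer,
    director text
);

INSERT INTO movie_flat VALUES
    ('Movie A', 2019, 'Director X'),
    ('Movie B', 2020, 'Director X'),
    ('Movie C', 2020, 'Director Y');

-- Decompose: directors get a surrogate key from a window function.
CREATE TEMP TABLE director AS
SELECT row_number() OVER (ORDER BY name) AS director_id, name
FROM (SELECT DISTINCT director AS name FROM movie_flat) AS d;

CREATE TEMP TABLE movie AS
SELECT row_number() OVER (ORDER BY f.title, f.year) AS movie_id,
       f.title, f.year, d.director_id
FROM movie_flat AS f
JOIN director AS d ON d.name = f.director;

-- View that joins the decomposed tables back together.
CREATE TEMP VIEW movie_rejoined AS
SELECT m.title, m.year, d.name AS director
FROM movie AS m
JOIN director AS d USING (director_id);

-- Lossless-join check: list rows that exist on one side only.
-- An empty result means the join reproduces the original table.
-- Note: rows with a NULL director would be dropped by the inner joins
-- above and would show up here as 'missing after join'.
(SELECT 'missing after join' AS problem, *
 FROM (SELECT * FROM movie_flat EXCEPT SELECT * FROM movie_rejoined) AS a)
UNION ALL
(SELECT 'extra after join', *
 FROM (SELECT * FROM movie_rejoined EXCEPT SELECT * FROM movie_flat) AS b);
```

With the sample rows shown, the final query should return no rows, since every original row can be rebuilt from `movie` and `director`.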
- Clone repository
$ git clone https://github.com/AposLaz/POSTGRESQL_NORMALIZATION.git
$ cd POSTGRESQL_NORMALIZATION
# Remove current origin repo
$ git remote remove origin
- Docker
$ docker-compose up
# Then configure pgAdmin at http://localhost:5050
# Log in with:
username: admin@admin.com
password: root
# Register a server with:
host: pg_container
username: root
password: root