AposLaz/POSTGRESQL_NORMALIZATION

Database Normalization

This project takes a dataset of movie information stored in a CSV file (movies.csv), cleans it, and turns it into a normalized set of tables.

Tables

Initial movie table

[Diagram: initial_db (draw.io)]

Normalized set of tables

[Diagram: sql_er_diagram (draw.io)]

Steps for normalization

- [X] Remove special characters from columns **movies** & **year**
- [X] Replace empty values with NULL
- [X] Trim spaces and remove newlines from columns
- [X] Remove multivalued attributes
- [X] Remove duplicate values
- [X] Find Functional Dependencies
- [X] Decompose Tables
- [X] Set surrogate keys 
- [X] Check for lossless joins
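
The cleaning steps in the checklist above can be sketched outside the database to make the logic concrete. The repository does this work in PostgreSQL; the snippet below is only a Python illustration, and every name in it (`clean_text`, `clean_year`, `split_multivalue`, `dedupe`, the sample rows and the column layout) is a hypothetical stand-in rather than something taken from the project. Splitting a comma-separated cell here plays roughly the role that `unnest(string_to_array(...))` would play in PostgreSQL.

```python
import re


def clean_text(value):
    """Collapse whitespace/newlines, trim, and map empty strings to None (NULL)."""
    if value is None:
        return None
    cleaned = re.sub(r"\s+", " ", value).strip()
    return cleaned or None


def clean_year(value):
    """Keep only digits and hyphens, e.g. '(2021)' becomes '2021'."""
    value = clean_text(value)
    if value is None:
        return None
    value = value.replace("\u2013", "-")  # normalize an en dash to a hyphen
    return re.sub(r"[^0-9-]", "", value) or None


def split_multivalue(value, sep=","):
    """Split a multivalued cell into a list of trimmed, non-empty items."""
    value = clean_text(value)
    if value is None:
        return []
    return [item.strip() for item in value.split(sep) if item.strip()]


def dedupe(rows):
    """Remove duplicate rows while preserving first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical raw rows in the assumed layout (title, year, genres).
raw = [
    ("  Movie A\n", "(2021)", "Action, Horror, Thriller"),
    ("Movie A", "(2021)", "Action, Horror, Thriller"),
    ("Movie B", "(2010\u20132022)", "Drama"),
    ("Movie C", "", "   "),
]

cleaned = dedupe([(clean_text(t), clean_year(y), clean_text(g)) for t, y, g in raw])

# One (title, genre) row per genre, which removes the multivalued column.
movie_genres = [(t, genre) for t, _, g in cleaned for genre in split_multivalue(g)]
```

After this runs, `cleaned` holds one row per distinct movie with empty cells turned into `None`, and `movie_genres` holds one row per movie and genre pair, which is the shape a separate genre table would need.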

Functions & Techniques

  • aggregate functions
  • window functions
  • views
  • joins
  • unions
  • unnest()
  • replace()
  • substring()
  • trim()
  • nullif()
  • regexp_replace()
  • left()
  • right()
  • string_to_array()
  • cast()
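
As a rough illustration of how joins and set operations can support a lossless-join check, the sketch below decomposes a small flat table and verifies that joining the pieces reproduces it exactly. It runs the SQL through Python's built-in `sqlite3` module so that it works without a database server; this is an assumption made for the example, not how the repository runs its queries, although the statements used (`CREATE TABLE ... AS SELECT`, `JOIN`, `EXCEPT`) are also available in PostgreSQL. The table names, column names, and rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical flat table: here each title determines its director.
cur.execute("CREATE TABLE flat (title TEXT, director TEXT, genre TEXT)")
cur.executemany(
    "INSERT INTO flat VALUES (?, ?, ?)",
    [
        ("Movie A", "Director X", "Action"),
        ("Movie A", "Director X", "Horror"),
        ("Movie B", "Director Y", "Drama"),
    ],
)

# Decompose into two tables that share the column `title`.
cur.execute("CREATE TABLE movie AS SELECT DISTINCT title, director FROM flat")
cur.execute("CREATE TABLE movie_genre AS SELECT DISTINCT title, genre FROM flat")

# Query that rebuilds the flat table from the decomposed tables.
rejoined = """
    SELECT m.title, m.director, g.genre
    FROM movie AS m
    JOIN movie_genre AS g ON g.title = m.title
"""

# Rows lost by the decomposition (present in flat, absent from the join).
missing = cur.execute(
    f"SELECT title, director, genre FROM flat EXCEPT {rejoined}"
).fetchall()

# Spurious rows created by the decomposition (present in the join, absent from flat).
spurious = cur.execute(
    f"{rejoined} EXCEPT SELECT title, director, genre FROM flat"
).fetchall()

# The decomposition is lossless for this data if both differences are empty.
is_lossless = not missing and not spurious
conn.close()
```

A check like this only shows that the join is lossless for the rows actually loaded; the functional dependencies are what justify the decomposition in general.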

Getting started

  1. Clone repository
	$ git clone https://github.com/AposLaz/POSTGRESQL_NORMALIZATION.git
		
	$ cd POSTGRESQL_NORMALIZATION

	# Remove current origin repo
	$ git remote remove origin  
  2. Docker
	$ docker-compose up

	# Then open pgAdmin in a browser at localhost:5050 and log in
	# username: admin@admin.com
	# password: root

	# Register the server in pgAdmin with these connection settings
	# host: pg_container
	# username: root
	# password: root
