A Python script to automate importing a CSV file into a MySQL database hosted on AWS RDS
This script will automatically import CSV files to a MySQL database.
- Install the requirements first: pip install -r requirements.txt
- Place the CSV file that needs to be imported in the same directory as the script and rename it to products.csv
- Pass the MySQL database host endpoint, hostUserName, and master password as parameters and run the main script.
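The parameter handling described above could look roughly like the sketch below. The flag names (--host, --user, --password, --csv) and the example endpoint are assumptions made for illustration; the actual script may name or order its parameters differently.

```python
import argparse


def parse_args(argv=None):
    """Parse connection settings. Flag names are illustrative, not the script's real interface."""
    parser = argparse.ArgumentParser(
        description="Import products.csv into a MySQL database on AWS RDS"
    )
    parser.add_argument("--host", required=True, help="RDS endpoint of the MySQL instance")
    parser.add_argument("--user", required=True, help="database user name")
    parser.add_argument("--password", required=True, help="master password")
    parser.add_argument("--csv", default="products.csv", help="CSV file to import")
    return parser.parse_args(argv)


# Example call with placeholder values (not real credentials).
args = parse_args(
    ["--host", "example.rds.amazonaws.com", "--user", "admin", "--password", "secret"]
)
print(args.host, args.user, args.csv)
```

Defaulting the file name to products.csv keeps the sketch in line with the renaming step above.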
createTableQuery = ''' CREATE TABLE products(
`name` VARCHAR(256) NOT NULL,
`sku` VARCHAR(100) NOT NULL UNIQUE,
`description` VARCHAR(1024) DEFAULT NULL
) '''
aggregateQuery = ''' CREATE TABLE `productAggregation`
AS SELECT name,count(name) as Count from products group by name '''
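To see what the two queries produce, the following sketch runs them on three made-up rows. It uses an in-memory SQLite database purely as a stand-in so that it runs without a MySQL server; the sample rows are invented, and the backticks are dropped only to keep the SQL portable.

```python
import sqlite3

# In-memory SQLite connection used as a stand-in for the MySQL connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Same schema as createTableQuery.
cur.execute(
    """CREATE TABLE products(
        name VARCHAR(256) NOT NULL,
        sku VARCHAR(100) NOT NULL UNIQUE,
        description VARCHAR(1024) DEFAULT NULL
    )"""
)

# Made-up sample rows: two products share a name, each has its own sku.
cur.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("Alice Smith", "sku-1", "first"),
        ("Alice Smith", "sku-2", "second"),
        ("Bob Jones", "sku-3", None),
    ],
)

# Same statement as aggregateQuery: one row per name with its product count.
cur.execute(
    """CREATE TABLE productAggregation
       AS SELECT name, count(name) AS Count FROM products GROUP BY name"""
)

aggregated = cur.execute("SELECT * FROM productAggregation ORDER BY name").fetchall()
print(aggregated)  # [('Alice Smith', 2), ('Bob Jones', 1)]
```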
- Total number of products = 466693 (the remaining rows updated existing records because they had the same 'sku' value)
- Total aggregation count = 212630
- Sample 10 rows from products data:
- Sample 10 rows from productAggregation data:
- The code follows OOP concepts.
- Supports updating existing products in the table, using 'sku' as the unique key.
- All product data is ingested into a single table.
- Creates an aggregated table from the rows above, with name and number of products as columns.
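One common way to get the update-on-duplicate behaviour in MySQL is INSERT ... ON DUPLICATE KEY UPDATE, which relies on the UNIQUE constraint on sku. The sketch below is an assumption about how this could be done, not necessarily how the script does it. The helper names read_rows and upsert_products are hypothetical, the CSV is assumed to have a header row with name, sku, and description columns, and the cursor is assumed to be a DB-API cursor that uses %s placeholders (as mysql-connector-python and PyMySQL do).

```python
import csv

# MySQL upsert: insert a new row, or overwrite name/description when the sku already exists.
# VALUES() in the UPDATE clause is deprecated in recent MySQL 8.0 releases but still accepted.
UPSERT_QUERY = (
    "INSERT INTO products (name, sku, description) VALUES (%s, %s, %s) "
    "ON DUPLICATE KEY UPDATE name = VALUES(name), description = VALUES(description)"
)


def read_rows(csv_file):
    """Yield (name, sku, description) tuples from an open CSV file with a header row."""
    for record in csv.DictReader(csv_file):
        yield (record["name"], record["sku"], record["description"])


def upsert_products(cursor, rows):
    """Send all rows through the upsert statement and return how many were submitted."""
    rows = list(rows)
    cursor.executemany(UPSERT_QUERY, rows)
    return len(rows)
```

With a real connection, usage might look like upsert_products(connection.cursor(), read_rows(open("products.csv", newline=""))) followed by connection.commit().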
Since time is a concern, processing speed could be increased with techniques such as multiprocessing and threading.
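A minimal sketch of the threading idea, assuming the rows are split into chunks and each chunk is handed to a worker function. The names chunked, load_in_parallel, and load_chunk are hypothetical. In practice load_chunk would open its own database connection, write its chunk, and return the number of rows written.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def load_in_parallel(rows, load_chunk, chunk_size=10_000, max_workers=4):
    """Run load_chunk on every chunk in a thread pool and return the total row count."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(load_chunk, chunked(rows, chunk_size)))


# Stand-in worker: len() simply reports the chunk size instead of writing to a database.
total = load_in_parallel(range(25), len, chunk_size=10)
print(total)  # 25
```

Threads suit this workload because most of the time is spent waiting on the database. One caveat: rows that share a sku may land in different chunks and be applied in either order, so the final value for such a sku is not guaranteed to be the last one in the file.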