Google Colab notebook that extracts, transforms, and loads (ETL) a large dataset into an AWS RDS instance. pgAdmin is used to set up the PostgreSQL database tables and run SQL queries against the database.
Updated Nov 28, 2020
Analyze whether reviews from Amazon's Vine program are trustworthy by using PySpark on Google Colab.
Analyze a dataset from the Amazon Vine program to determine whether reviews from Vine members are biased toward favorable ratings.
Used MapReduce, natural language processing (NLP) for big data, Google Colab, AWS, and Visual Studio Code to analyze Amazon Vine reviews.
Project to analyze reviews written as part of the Amazon Vine program. PySpark and Python are used to create DataFrames and clean the raw data. The cleaned data is loaded into AWS RDS for analysis. The relationship between paid and free reviews is then investigated.
Took a customer reviews dataset from AWS S3 and processed the data with PySpark on Google Colab; the cleaned dataset is stored in the appropriate tables in AWS RDS through pgAdmin.
Using PySpark to perform ETL on a video game review dataset from the Amazon Vine program; setting up a connection between Amazon RDS and Postgres; determining bias in the dataset using PySpark.
An analysis of Amazon Vine reviews via pgAdmin and PySpark, comparing the value of Vine reviews with that of non-Vine reviews.
Investigated whether Vine reviews are free of bias using SQL and ETL skills to analyze the data.
The purpose of the project is to analyze Amazon reviews written by members of the paid Amazon Vine program using AWS, PySpark, and SQL.
Analysis of Amazon's Vine review program using PySpark and AWS RDS with PostgreSQL.
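Several of the projects above ask whether Vine reviews skew favorable. One simple way to frame that question is to compare the share of five-star reviews among Vine and non-Vine reviews. The following is a minimal plain-Python sketch of that comparison, not any listed repository's actual method. The field names `vine` and `star_rating`, the helper `five_star_share`, and the sample rows are illustrative assumptions.

```python
from collections import defaultdict


def five_star_share(reviews):
    """Return {vine_flag: fraction of that group's reviews rated 5 stars}.

    Assumes each review is a dict with hypothetical keys
    "vine" ("Y" or "N") and "star_rating" (integer 1-5).
    """
    totals = defaultdict(int)
    five_star = defaultdict(int)
    for review in reviews:
        flag = review["vine"]
        totals[flag] += 1
        if review["star_rating"] == 5:
            five_star[flag] += 1
    return {flag: five_star[flag] / totals[flag] for flag in totals}


# Made-up sample rows, for illustration only (not real review data).
sample = [
    {"vine": "Y", "star_rating": 5},
    {"vine": "Y", "star_rating": 4},
    {"vine": "N", "star_rating": 5},
    {"vine": "N", "star_rating": 3},
    {"vine": "N", "star_rating": 5},
    {"vine": "N", "star_rating": 1},
]

shares = five_star_share(sample)
for flag, share in sorted(shares.items()):
    print(f"vine={flag}: {share:.0%} five-star")
```

On a real dataset, the same grouping would more likely be done with a PySpark or SQL aggregation. A gap between the two shares is only suggestive of bias, since Vine and non-Vine reviews may cover different products.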