Author: Ayman Salama
Contact: ayman3salama@gmail.com
Python scripts for testing purposes
Note: Each folder contains:
- Python Script
- The data file that is used by the Python script. The data files are sometimes zipped because of their size.
- Results folder containing the output of the script. File names are descriptive, as per the sent document. Some results are zipped because of their size.
The code solves problems such as:
- Get the Median of each product, Merge the Median result with the product ID
- Get the Mean, Min and Max of each product, Merge the Mean result with the product ID
- Get the Best Performing Product (Based on volume)
- Identify the most promising product using regression analysis
- Identify the top 5 worst performing products on a biweekly basis
- Identify outliers from the data and output the corresponding week numbers using normal distribution
- Use NLP to extract information from text, such as title, duration, location, description, etc., given the inconsistency in the data.
- Deal with Null vs N/A values in data.
- Perform several text processing steps to extract information
- Store the data in a DataFrame, SQL, and an S3 Bucket
- Get the similarity between several descriptions
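
The per-product Median, Mean, Min and Max items, and the best-performing-product item, can be illustrated with a minimal sketch. It uses only the Python standard library; the actual scripts may well use a DataFrame instead. The sample rows, the field names and the `summarize` helper are hypothetical and are not taken from the scripts in this repository.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample rows: (product_id, weekly_volume).
rows = [("P1", 10), ("P1", 30), ("P1", 20), ("P2", 5), ("P2", 15)]

def summarize(rows):
    """Group volumes by product ID and attach the statistics to each ID."""
    by_product = defaultdict(list)
    for product_id, volume in rows:
        by_product[product_id].append(volume)
    return [
        {
            "product_id": product_id,
            "median": median(volumes),
            "mean": mean(volumes),
            "min": min(volumes),
            "max": max(volumes),
            "total": sum(volumes),
        }
        for product_id, volumes in sorted(by_product.items())
    ]

summary = summarize(rows)

# Assumption: "best performing" means the highest total volume.
best = max(summary, key=lambda row: row["total"])
```

Each entry in `summary` keeps the product ID next to its statistics, which is one way to read "merge the result with the product ID".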
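
For the regression item, one possible reading is that the "most promising" product is the one whose volume grows fastest over time. The sketch below fits an ordinary least-squares line to each product's weekly volumes and compares the slopes. This interpretation, the sample data and the `trend_slope` helper are assumptions made for illustration, not a description of the original script.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator

# Hypothetical weekly volumes per product.
weekly_volumes = {
    "P1": [10, 12, 14, 16],
    "P2": [20, 19, 21, 20],
    "P3": [5, 10, 15, 20],
}

slopes = {pid: trend_slope(vols) for pid, vols in weekly_volumes.items()}

# Assumption: the steepest upward trend marks the most promising product.
most_promising = max(slopes, key=slopes.get)
```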
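
The outlier item mentions the normal distribution. A common way to apply it is a z-score test: a week is flagged when its value lies more than a chosen number of standard deviations from the mean. The sketch below follows that approach; the threshold of 2.0, the sample values and the `outlier_weeks` helper are illustrative assumptions.

```python
from statistics import mean, pstdev

def outlier_weeks(weekly_values, threshold=2.0):
    """Return week numbers whose value is more than `threshold`
    population standard deviations away from the mean."""
    values = list(weekly_values.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [
        week
        for week, value in sorted(weekly_values.items())
        if abs(value - mu) / sigma > threshold
    ]

# Hypothetical data: week number -> observed value.
sample = {1: 10, 2: 11, 3: 9, 4: 10, 5: 12, 6: 10, 7: 11, 8: 9, 9: 10, 10: 50}
flagged = outlier_weeks(sample)
```

With small samples a single extreme value inflates the standard deviation, so the threshold may need tuning for the real data.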
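
For the description-similarity item, the README does not say which measure the scripts use. As one simple possibility, the sketch below computes Jaccard similarity over lowercase word tokens. The `tokens` and `jaccard` helpers and the sample strings are hypothetical.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty descriptions are treated as identical
    return len(ta & tb) / len(ta | tb)

# Hypothetical descriptions.
score = jaccard("data science course", "data engineering course")
```

Jaccard ignores word order and word frequency, so it is only a rough first pass; TF-IDF or embedding-based measures are common alternatives when the descriptions are longer.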