"Data-savvy graduate (May 2023) seeking a Data Engineer/Analyst role to solve a company's biggest business problems"
• Scraped 24,000+ products across 25+ categories from Walmart, along with their metadata, to develop “Walmart Lens”.
• Eliminated data redundancies and skewness by pre-processing and cleaning.
• Analyzed around 25,000 input images with a CNN model to list all recognized products (<=10).
• Used GCP’s Vertex AI and Google Cloud Vision to generate labels and detect text (<=10 items) in an image.
• Labeled the products using the Amazon Rekognition Custom Labels API and AWS Lambda functions.
• Web App
• Performed data wrangling and EDA on real-world details of 84,000+ building units sold in NYC over one year.
• Created a quantile regression model of sale price against area and applied Recursive Feature Elimination (RFE) to select the top 10 features.
• Designed a Random Forest Regressor model to compare its results with those from RFE.
• Built a neural network and improved its accuracy using the Adam optimizer.
• Developed a script to scrape 10,000+ tweets in real time.
• Analyzed tweet data to determine the impression a tweet makes based on keywords and usernames.
• Visualized the results with Seaborn and Matplotlib to show the extent of correlation between keywords and users.
• Improved the script's efficiency so that it scrapes tweets for multiple keywords simultaneously.
• Performed data cleaning, wrangling, outlier detection and stop word removal for some columns in the data.
• Determined relationships between categorical variables using the chi-square test and built Random Forest Regressors with different numbers of estimators. Calculated the model score (R-squared) for each number of estimators by fitting on X and Y.
• Computed open interest per minute per strike price for any given stock, as well as for NIFTY and Bank NIFTY.
• BscScan Transaction Alert by Email
Note: This may or may not indicate my skill level; it is just a GitHub metric of the languages in my commits.