Emmanuel Data Scientist
Quality-driven individual delivering Business Intelligence, with rich experience in Data Science, Data Engineering, Machine Learning, and Python & PySpark Development at MNCs such as Capgemini; targeting assignments in Data Science & Data Engineering with a growth-oriented organization
+91-7026710752 | emmanuelanalyst@gmail.com | https://www.linkedin.com/in/emmanuel-devadas-0a13b2216 | https://github.com/Emmanuelmitra
AREAS OF EXPOSURE
• Data Science • Data Engineering • Data Visualization • Machine Learning Algorithms • Statistical Modeling • Natural Language Processing • Predictive Analytics • Artificial Intelligence • Python Development • PySpark Development • Software Development
TECHNICAL SKILLS
• Web Languages/Frameworks: SQL, Flask/Django • Operating Systems: Windows 7, Windows 10 • Data Visualization Tools: Tableau, Matplotlib, and Seaborn • Databases: MySQL • Programming Languages: Python, PySpark, SQL, and pytest • Cloud Computing and Deployment Platforms: AWS, Google Cloud, and Azure • Python ML Libraries: scikit-learn, Pandas, NumPy, PySpark, NLTK, spaCy, SciPy
SOFT SKILLS
• Analytical • Numerical Competency • Problem-solving • Team Player • Time Management • Communication
CERTIFICATIONS
• Machine Learning with Python (IBM) • Data Analysis using PySpark • Data Science using Python • SQL for Data Analytics
EDUCATION
• B.E. (ECE) from BKIT, Bhalki, Belgaum in 2014
PERSONAL DETAILS
• Date of Birth: 20th February 1993 • Languages Known: English, Kannada & Hindi • Address: 1st Floor, Ashture Complex, Bhalki, 585328, Karnataka
PROFILE SUMMARY
• Result-oriented professional with over 2 years of rich experience in Data Science & Python Development, along with knowledge of Machine Learning, Deep Learning Models, Data Extraction, Hypothesis Testing, ANOVA, and ROC; industry-specific knowledge of Healthcare, Finance, and Retail
• Strong understanding of Data Analysis and Visualization Tools, Cloud Computing and Deployment Platforms, Operating Systems, and Web Languages/Frameworks
• Identified, developed, and implemented appropriate statistical techniques, algorithms, and Deep Learning/ML Models, including hypothesis testing, correlation, and regression, and used confusion matrices to evaluate new, scalable solutions that resolved business challenges
• Proficient in supervised and unsupervised Machine Learning techniques, including regression, classification, clustering, text mining, bagging, and boosting
• Hold certifications in Machine Learning, Data Science & Data Analytics; proficient in Python, PySpark, and SQL
• Currently upskilling through the Full-stack Data Science certification course from iNeuron
• A focused individual with a zeal to learn and adapt to new technologies quickly; capable of managing critical situations
NOTABLE ACCOMPLISHMENTS ACROSS THE CAREER
• Received the “Rising Star” Award at Capgemini in 2022 for outstanding performance in delivering the project within the stipulated time
• Reduced the project's completion time by 2 months by using common functions and code analysis to frame solutions
• Achieved the highest performance rating at Capgemini for best performance throughout the year
• Assisted proactively in developing a PySpark-based Data Science framework designed to handle large-volume data sets
WORK EXPERIENCE
Feb’22 – Present | Data Scientist | Capgemini, Bengaluru
Key Result Areas:
• Collaborating with multi-disciplinary teams to understand business requirements for products and projects
• Analyzing statistical data: importing data from multiple sources, joining datasets, cleaning and preprocessing, data wrangling, and visualization
• Identifying and implementing statistical models, including linear models, multivariate analysis, stochastic models, sampling, optimization, and time series analysis
• Diagnosing bugs and data flow issues: reviewing old code to ensure that data is flowing correctly and fixing issues as they arise
• Employing Machine Learning algorithms for classification (SVM, decision trees, random forest, and naïve Bayes), regression (linear and logistic), and clustering (k-means and hierarchical)
Project Name: Market Stress Testing Platform
Project Description: To build a stress testing platform that evaluates the resilience of financial portfolios to market variations, using PySpark, SQL & Python
• Employed Machine Learning and statistical models to simulate market conditions and assess portfolio risk; optimized the performance of the platform by fine-tuning the model hyperparameters
• Evaluated and improved the platform's performance, ensuring delivery of valuable insights into portfolio risk and helping financial institutions make informed investment decisions under varying market conditions
Project Name: Named Entity Recognition from Medical Journals
Project Description: Going through medical articles and papers to find the entities they mention is very time-consuming. The aim was to build a transformer-based model that summarizes entities in tabular form for easier analysis, using spaCy and the Med7 trained model
• Applied Machine Learning and deployed models
• Constructed data sets by understanding the purpose of the data and the story it is going to tell
INTERNSHIP
Sep’21 – Feb’22 | Data Science Intern | Innodatatics, Hyderabad
Key Result Areas:
• Coordinated with Development Teams to determine application requirements
• Drafted scalable code using the Python programming language; developed back-end components
Project Managed: Predictive Healthcare Analytics Platform
Project Description: To build a Predictive Analytics Platform that assesses the likelihood of disease relapse in TNBC patients, using PySpark, Python & SQL
• Implemented Machine Learning and statistical models to predict disease relapse
• Fine-tuned the model hyperparameters; optimized and evaluated the resulting models