Aggregated Capstone and Certificate Projects in Data Analysis and Predictive Data Models

craigtrupp/data_analysis_projects


💹 Data Analysis & Modeling Repository Overview 💹

This repository collects the materials used to complete capstone projects and accredited certifications in Data Analysis and Modeling. The principal areas of focus below highlight the key concepts, skills, and topics covered.

Principal Areas of Focus 🔦

  • Statistical Experimentation
  • Data Management/Validation
  • Data Wrangling
  • Product Sales Analysis
    • Most Advantageous Sales Method Types
    • Density Revenue Distribution by Sales Method
  • Revenue Distribution
    • Revenue Distribution by Sales Methods
    • Revenue Distribution Over All Time Intervals
  • Customer Insights
    • Customer distribution by sales_method applied to customer
    • Customer Website visits correlation to revenue
    • Customer distribution by years_as_customer
  • Binning Customer Activities by Categorical Features
  • Key Business Metric Identification
  • Predictive Website Traffic Analysis
  • Principal Component Analysis & Feature Selection
  • Multiple Classification Model Generation & Accuracy Check

Tools Used 🧰

  • SQL
  • Python
  • Pandas
  • Seaborn
  • Matplotlib
  • Sklearn
    • StandardScaler
    • Logistic Regression
    • SVC
    • PCA
    • Model_Selection : (KFold, train_test_split, cross_val_score)
  • Beautiful Soup (Web Scraping)
  • Requests (API Library)
  • IBM Cognos Dashboard
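
As a rough illustration of how the Sklearn pieces listed above typically fit together, the sketch below scales features, reduces them with PCA, and compares Logistic Regression against SVC using K-fold cross-validation and a held-out test split. It is not taken from the repository's notebooks: the synthetic dataset, the choice of 5 principal components, and all variable names are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): the real projects use their own datasets.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)

# Hold out 20% of the rows for a final accuracy check.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Candidate classifiers to compare.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svc": SVC(),
}

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}

for name, estimator in models.items():
    # Scaling and PCA live inside the pipeline so they are re-fit on each fold.
    pipeline = make_pipeline(StandardScaler(), PCA(n_components=5), estimator)
    cv_scores = cross_val_score(pipeline, X_train, y_train, cv=kfold)
    pipeline.fit(X_train, y_train)
    results[name] = {
        "cv_mean_accuracy": cv_scores.mean(),
        "test_accuracy": pipeline.score(X_test, y_test),
    }

for name, metrics in results.items():
    print(name, metrics)
```

Keeping the scaler and PCA inside the pipeline means each cross-validation fold fits them on its own training portion only, which avoids leaking information from the validation rows into the preprocessing step.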

Project Details & Certificate Issuers 🏗️

IBM Data Analyst

IBM provides a comprehensive learning path through Coursera that culminates in a capstone project, linked below. The curriculum combines labs, tests, and exercises with the capstone project to bring together the practical, up-to-date skills and tools that data analysts use in their daily roles. Areas of focus included:

  • Spreadsheets, Microsoft Excel
  • Python Programming, Pandas, Numpy, Matplotlib
  • SQL, Dashboard and BI Tools such as IBM Cognos Analytics
  • Presentation Generation for Business Metrics and Stakeholder Review

Capstone : IBM 🪨

  • Capstone Section Details
    • Section I : Data Collection, Web Scraping, Data Exploration
    • Section II : Data Wrangling (Duplicate Identification & Removal, Normalize and Impute Missing Values)
    • Section III : Data Analysis (Distribution of Data, Correlation Between Features, Outlier Detection and Removal)
    • Section IV : Data Visualization (Visualize Composition & Comparison of Data, Distribution of Data, Relationship Between Features)
  • Cognos Dashboard
    • Project Highlights
      • Technology usage by survey respondents for Programming Languages, Platforms, WebFrames
      • Future technology trends: survey respondents' desired areas for ongoing skill/education development across the same topics as above
      • Demographics for survey respondents by Gender, Country, Age, & Education
    • Contains three sheets with differing visualization types
      • Bubble & Word Cloud Charts
      • Vertical & Horizontal Bar Charts
      • Pie & Line Charts
      • Geographic Density Map
    • Dashboard Link : IBM Cognos Project Dashboard
      • GitHub does not support the _target attribute, so this link will navigate away from the repository listing in your current browser tab. Please use CTRL+click (PC) or CMD+click (Mac) to open this resource in a new tab
  • Presentation Slides with Data Generated from Previous Sections
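
To make the wrangling and analysis steps named in Sections II and III more concrete, here is a minimal pandas sketch of duplicate removal, median imputation of missing values, and IQR-based outlier removal. The data, the column names (`respondent_id`, `age`, `salary`), and the helper functions are hypothetical and invented for this example; the capstone notebooks may use different columns and different rules.

```python
import pandas as pd

# Hypothetical survey-style records; columns and values are invented.
raw = pd.DataFrame(
    {
        "respondent_id": [1, 2, 2, 3, 4, 5, 6],
        "age": [25, 31, 31, None, 42, 38, 29],
        "salary": [50_000, 62_000, 62_000, 58_000, 1_000_000, 61_000, 55_000],
    }
)


def wrangle(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then fill missing ages with the median age."""
    deduped = df.drop_duplicates().reset_index(drop=True)
    deduped["age"] = deduped["age"].fillna(deduped["age"].median())
    return deduped


def remove_outliers_iqr(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose value lies within k * IQR of the first and third quartiles."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    within_bounds = df[column].between(q1 - k * iqr, q3 + k * iqr)
    return df[within_bounds].reset_index(drop=True)


clean = wrangle(raw)
trimmed = remove_outliers_iqr(clean, "salary")

print(f"rows: raw={len(raw)}, deduplicated={len(clean)}, after outlier removal={len(trimmed)}")
```

The 1.5 × IQR rule is only one common convention for flagging outliers; whether a flagged value should actually be dropped depends on the dataset and the question being asked.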

Additional Section Learning Material



DataCamp

DataCamp offers Associate and Professional level certifications that put candidates' technical and reporting acumen to the test and show that their skills are job-ready. DataCamp's coding challenges are free-form: candidates are given a dataset, and it is up to them to devise an appropriate solution. The goal is to demonstrate that the candidate can perform the tasks required of a data analyst or scientist without being guided toward the appropriate solution.

Coding Challenges

Please see this area's individual markdown file, which details the exams administered by DataCamp for its certification requirements.

For Further Details on DataCamp's Certification Process

