This repository has been archived by the owner. It is now read-only.
Dataset(after preprocessing)
Dataset(before preprocessing)
Big Data Project.pptx


Project Description

The project must apply the technologies studied in the course and lab by working through the Big Data analytics project lifecycle (as covered in the lectures) for any topic, purpose, or dataset you like. This project accounts for the full mark of the course's practical exam. The following requirements should be met:

  1. A real (i.e. published) dataset with multiple dimensions
  2. Use of data preprocessing you studied in lab (if needed)
  3. Use of different visualization aids and modalities to visualize your data
  4. Use of one analytic method you studied (Regression, K-means, Apriori, etc.)
  5. Organized and readable code
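
The requirements above (preprocessing, visualization, one analytic method) can be sketched end to end in base R. This is only an illustrative sketch, not the project's actual code: the data frame `raw`, its columns (`age`, `sex`, `n_substances`), and every value in it are hypothetical stand-ins for a real published dataset, and K-means is picked simply because it is one of the methods the list names.

```r
# Toy records standing in for a real published dataset.
# All column names and values here are hypothetical.
raw <- data.frame(
  age          = c("34", "51", NA, "27", "45", "62", "38", "29"),
  sex          = c("Male", "female", "Male", "MALE", NA, "Female", "male", "Female"),
  n_substances = c(2, 1, 3, NA, 2, 1, 4, 3),
  stringsAsFactors = FALSE
)

# 1. Preprocessing: fix types, normalise category spelling, drop incomplete rows.
clean <- raw
clean$age <- as.integer(clean$age)
clean$sex <- factor(tolower(clean$sex), levels = c("female", "male"))
clean <- clean[complete.cases(clean), ]

# 2. Visualization: counts per category and a scatter plot of two numeric dimensions.
barplot(table(clean$sex), main = "Records by sex", ylab = "Count")
plot(clean$age, clean$n_substances, xlab = "Age", ylab = "Substances detected")

# 3. Analytics: K-means on the scaled numeric columns (k = 2 is an arbitrary choice).
set.seed(1)
features <- scale(clean[, c("age", "n_substances")])
km <- kmeans(features, centers = 2)
clean$cluster <- km$cluster

print(clean)
```

Scaling before `kmeans()` keeps the column with the larger numeric range (here `age`) from dominating the distance calculation. On a real dataset, the number of clusters would need to be chosen with more care than in this sketch.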

Dataset Description

  • A listing of every accidental death associated with drug overdose in Connecticut (the southernmost state in the New England region of the United States) from 2012 to 2018.
  • The data are derived from investigations by the Office of the Chief Medical Examiner, which include the toxicity report and the death certificate.


  1. A PPT presentation showing your data visualization, analytics, conclusions, and how you implemented the Big Data project lifecycle

  2. Dataset (.CSV) and code files (R script)

  3. Project documentation that contains:

     a. The main idea of the project
     b. The dataset and its description
     c. The data visualization and/or any analytics used
     d. The tools and frameworks used
     e. The project code
     f. References for your readings and libraries

Development Tool


Project Achievement

Predict the manner of death (Accident, Pending, or Natural) for a person who uses drugs in Connecticut, New England.
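
The README does not say which model was used for this prediction. As one possible illustration, the sketch below fits a one-vs-rest logistic regression in base R on synthetic data. Everything in it is an assumption made for the example: the predictor columns (`age`, `heroin`, `fentanyl`), the rule that generates the labels, and the choice of technique itself. None of it is taken from the real dataset or from the project's code.

```r
# Synthetic data only: column names, values, and the labelling rule are hypothetical.
set.seed(42)
n <- 300
toy <- data.frame(
  age      = round(runif(n, 18, 80)),
  heroin   = rbinom(n, 1, 0.5),
  fentanyl = rbinom(n, 1, 0.4)
)

# Made-up noisy score so the three classes are learnable but not perfectly separable.
score <- 0.04 * toy$age - 1.5 * toy$heroin - 1.0 * toy$fentanyl + rnorm(n)
toy$manner <- cut(score,
                  breaks = c(-Inf, 0.5, 2, Inf),
                  labels = c("Accident", "Pending", "Natural"))

# One-vs-rest: fit one binary logistic regression per class.
classes <- levels(toy$manner)
fits <- lapply(classes, function(cl) {
  d <- toy
  d$target <- as.integer(d$manner == cl)
  glm(target ~ age + heroin + fentanyl, data = d, family = binomial)
})

# Predict the class whose model reports the highest probability.
probs <- sapply(fits, predict, newdata = toy, type = "response")
colnames(probs) <- classes
pred <- factor(classes[apply(probs, 1, which.max)], levels = classes)

# In-sample confusion matrix and accuracy (optimistic; no train/test split here).
print(table(predicted = pred, actual = toy$manner))
accuracy <- mean(pred == toy$manner)
print(accuracy)
```

Because the accuracy is measured on the same rows the models were fitted on, it overstates real performance. A held-out test set or cross-validation would be needed before drawing conclusions from a model like this.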
