jonxsong/DSC180AB-Capstone

Prediction Task: Utilizing CPU Statistics and Application Usage to Predict a User’s Persona

Homepage

https://vlw003.github.io

Medium Blog

https://predicting-persona-b09group04.medium.com/

Usage

git clone https://github.com/jonxsong/DSC180AB-Capstone.git
cd DSC180AB-Capstone
python run.py test
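The command above passes a target name (test) to run.py. As a rough illustration of how a script can dispatch on such targets, here is a minimal sketch. The names run_test, TARGETS, and main are hypothetical and are not taken from this repository; only the test target itself appears in this README.

```python
def run_test():
    """Hypothetical stand-in for whatever work the `test` target performs."""
    return "ran test target"


# Hypothetical mapping from target name to handler function.
# Only `test` is documented in this README; other targets are not assumed.
TARGETS = {"test": run_test}


def main(targets):
    """Run each requested target in order and return the collected results."""
    results = []
    for name in targets:
        if name not in TARGETS:
            raise ValueError(f"unknown target: {name}")
        results.append(TARGETS[name]())
    return results
```

Under this sketch, a script would call main(sys.argv[1:]) so that python run.py test runs the test handler.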

Files

./config/data-params.json - config specifying the directory where output data is written

./config/hw-metric-histo-data-params.json - description of the dataset and features we utilize

./config/systems-sysinfo-unique-normalized-data-params.json - description of the dataset and features we utilize

./config/ucsd-apps-execlass-data-params.json - description of the dataset and features we utilize

./config/frgnd_backgrnd_apps-data-params.json - description of the dataset and features we utilize

./notebooks/eda.ipynb - notebook containing data explorations from DSC180B

./notebooks/dsc180a-notebook.ipynb - notebook containing data explorations from DSC180A

./src/data_exploration.py - file containing relevant methods for data exploration

./src/model.py - file containing relevant methods for data modelling

./requirements.txt - required packages

./run.py - entry point for the data analysis (see Usage)
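The config files listed above are JSON, so they can be read with the standard library. The following is a minimal sketch of one way to load them. The helper names load_params and load_all_params are hypothetical, and nothing is assumed about the keys inside each file.

```python
import json
from pathlib import Path


def load_params(config_path):
    """Read one JSON config file and return its contents as a dict."""
    with Path(config_path).open(encoding="utf-8") as fh:
        return json.load(fh)


def load_all_params(config_dir="config"):
    """Load every *-params.json file in config_dir, keyed by file name."""
    config_dir = Path(config_dir)
    return {
        path.name: load_params(path)
        for path in sorted(config_dir.glob("*-params.json"))
    }
```

For example, load_all_params("config")["data-params.json"] would return the parsed contents of ./config/data-params.json.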

Data/Output Files

./data/out/... - holds all figures generated by the analysis methods

./data/raw/... - holds all the datasets downloaded from the link below

Link to download the datasets:

https://drive.google.com/drive/folders/1nNpwhzrbKUJd0ZwbCYLGQH49CKkKLTQ4?usp=sharing

The datasets are too large to host on GitHub. Download them from the link above and store them in ./data/raw/.
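Because the datasets must be downloaded by hand, it can help to confirm the directory layout before running the analysis. The sketch below assumes the ./data/raw/ and ./data/out/ layout described above. The helper name prepare_data_dirs is hypothetical and is not part of this repository.

```python
from pathlib import Path


def prepare_data_dirs(root="."):
    """Create data/out if needed and list the files found in data/raw.

    Returns a sorted list of file names in data/raw, or an empty list
    if that directory does not exist yet.
    """
    root = Path(root)
    raw_dir = root / "data" / "raw"
    out_dir = root / "data" / "out"

    # Output figures are written here, so make sure it exists.
    out_dir.mkdir(parents=True, exist_ok=True)

    if not raw_dir.is_dir():
        return []
    return sorted(p.name for p in raw_dir.iterdir() if p.is_file())
```

An empty result means the datasets have not yet been placed in ./data/raw/.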

Sources

Link to Project Report: https://docs.google.com/document/d/1IpWfuG2IxurT5LOMyudWpn3UOLsKYKdjbbwqNhPGlYk/edit?usp=sharing

Responsibilities:

Jon:
- Report + main ideas
- data analysis
- code breakdown
- repository structuring
- notebook outlining
- script writing

Vince:
- data modeling
- Report + targets
- data cleaning
- data explorations
- classifications
- Visual Presentation Checkpoint
- Website
- Final Report
- Slides

Keshan:
- data preparation
- tabled data
- key notes all throughout notebook
- graphs + graph analysis
- ATL work
