ammar1y/Data-Mining-Assignment

Data-Mining-Assignment

This repository contains the following files related to my submission for the Data Mining assignment:

  • The "Report.pdf" file is the assignment report. It describes the work done in the different stages of this project.
  • The "Notebook" folder contains a Jupyter notebook that covers the most important steps of the project, with code, results, and explanations. The "Other Notebooks" folder contains Jupyter notebooks with additional analysis of the data.
  • The "Web crawlers" folder contains the Python scripts used for web crawling.
  • The "Stoage-in-database scripts" folder contains the Python scripts used to store crawled data in a MySQL database.

Note

To run the scripts or the Jupyter notebook in this repository, you need some data files. These files were uploaded to Dropbox because some of them are larger than GitHub allows.

To run the scripts and the notebook, download the data from this link:

https://www.dropbox.com/sh/gkamebpvxwe5nqs/AAAv2EvdJJNiHxpypWmTF9uda?dl=0

and then make sure that the file paths in the scripts and in the notebook point to the actual locations of the files on your system. After that, you should be able to run all the scripts without problems.
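As a rough illustration of the path setup described above, here is a minimal sketch that keeps the data location in one place and fails early if a file is missing. The folder name, the file name, and the `resolve_data_file` helper are hypothetical and do not come from the scripts in this repository; adjust them to match where you saved the Dropbox download.

```python
from pathlib import Path

# Hypothetical location of the downloaded Dropbox folder; change it to match your system.
DATA_DIR = Path.home() / "Downloads" / "data-mining-assignment-data"


def resolve_data_file(name, data_dir=DATA_DIR):
    """Return the full path to a data file, or raise if the file is missing."""
    path = Path(data_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"Expected data file not found: {path}")
    return path
```

A script could then call `resolve_data_file("example.csv")` (a placeholder file name) instead of hard-coding an absolute path. Building paths with `pathlib` also avoids hard-coded path separators, which may help when moving between macOS, Linux, and Windows.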

All scripts and the notebook were run and tested on a Mac. They should run normally on Linux, but some modifications might be needed to run them on Windows.

If you run into any errors when running the scripts, please contact me by email at ammar5656@gmail.com

The YouTube Video of the Assignment

You can watch the presentation video of this assignment by following this link:

https://youtu.be/3ctBNRdOJ3o

About

Ammar Alyousfi's submission for the Data Mining assignment (UM, June 2019)
