Classification and clustering of hotel reservation data to produce conclusions and decisions
Jupyter Notebook
Type Name Latest commit message Commit time
.ipynb_checkpoints [IMPROVE] calculate dendogram using toPandas() Mar 13, 2019
metastore_db more on clustering Mar 2, 2019
Clustering.ipynb
DataSetTemplate.ipynb
FinalProject.ipynb update explanations Mar 7, 2019
Hotel Data Science add extra feature Mar 7, 2019
Hotels_data_Changed.csv required features addition Dec 22, 2018
Hotels_data_for_best_discount_code.csv
README.md [DOC] add instructions for working with spark using docker Jan 10, 2019
clustering_in_spark.ipynb [IMPROVE] calculate dendogram using toPandas() Mar 13, 2019
derby.log more on clustering Mar 2, 2019
hotels_data.csv required features addition Dec 22, 2018

README.md

Hotels-Data-Science

Classification and clustering of hotel reservation data to produce conclusions and decisions

Spark using Docker

  1. Install cifs-utils:
sudo apt-get install cifs-utils
  2. Mount the remote notebook share:
sudo mount -t cifs -o username=<your_username>,password=<your_password>,uid=<your_user_id>,gid=<your_group_id> //<host_ip>/<your_path> /mnt/<your_directory>
  3. Install Docker:
sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install docker-ce
  4. Run the Spark container, mounting the directory from the mount step as the notebook work directory:
docker run --name pyspark --user root -p 8888:8888 -d -e NB_USER=<your_user> -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS='-R' -e GRANT_SUDO=yes -v /mnt/<your_directory>:/home/<your_user>/work -w /home/<your_user> jupyter/pyspark-notebook
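
The placeholders in the mount and docker run commands above are easy to mistype, so it may help to assemble both strings from variables and inspect them before running anything. The following is a minimal sketch only: the helper names `cifs_opts` and `pyspark_run_cmd` and the example values (`alice`, `/mnt/notebooks`) are hypothetical and not part of this repository. The functions only print text; they do not mount anything or start a container.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the repository): they only print strings.

# Build the option string passed to `mount -t cifs -o ...`.
# $1 = username, $2 = password, $3 = uid, $4 = gid
cifs_opts() {
    printf 'username=%s,password=%s,uid=%s,gid=%s\n' "$1" "$2" "$3" "$4"
}

# Print (without executing) the docker run command for the Spark container.
# $1 = notebook user inside the container, $2 = mounted host directory
pyspark_run_cmd() {
    printf 'docker run --name pyspark --user root -p 8888:8888 -d'
    printf ' -e NB_USER=%s -e CHOWN_HOME=yes -e CHOWN_HOME_OPTS=-R -e GRANT_SUDO=yes' "$1"
    printf ' -v %s:/home/%s/work -w /home/%s jupyter/pyspark-notebook\n' "$2" "$1" "$1"
}
```

For example, `pyspark_run_cmd alice /mnt/notebooks` prints the full command with `alice` substituted in all three places where the user name appears, which can then be reviewed and pasted into a terminal. Once the container is up, `docker logs pyspark` should show the Jupyter startup output, which in the Jupyter Docker images normally includes the login URL for port 8888.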