Based-on-Hadoop-K-means-clustering-algorithm-Parallel-implementation

Project Description

Use the Hadoop platform and the MapReduce programming model to implement the K-means clustering algorithm in parallel on large datasets, and apply it to analyzing which cities in China tourists prefer to travel to. The project consists of three parts:

Big data K-means parallel clustering (hadoop_kmeans folder)
K-means parallel clustering application (application_hadoop_kmeans folder)
Tourist preference page display (web_hadoop_kmeans folder)

Big data K-means parallel clustering

Using Hadoop's MapReduce programming model, run K-means cluster analysis on 4 million data records. The job is implemented in the following three environments:

Stand-alone
Pseudo-distributed
Fully distributed
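
The way K-means splits into map and reduce steps can be sketched in plain Python. This is only an illustration of the idea, not the code in the hadoop_kmeans folder: the function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`, `kmeans`) and the in-memory "shuffle" are assumptions made for the example, and a real Hadoop job would run one such pass per MapReduce job and pass the centroids between jobs.

```python
from collections import defaultdict
from math import dist


def kmeans_map(point, centroids):
    """Map step: emit (index of nearest centroid, (partial_sum, count))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def kmeans_reduce(values):
    """Reduce step: combine partial sums and counts into a new centroid."""
    dims = len(values[0][0])
    total = [0.0] * dims
    count = 0
    for partial_sum, n in values:
        for d in range(dims):
            total[d] += partial_sum[d]
        count += n
    return tuple(t / count for t in total)


def kmeans_iteration(points, centroids):
    """One MapReduce-style pass: map every point, group by key, reduce."""
    groups = defaultdict(list)
    for p in points:
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)  # stands in for Hadoop's shuffle phase
    # A centroid that attracted no points keeps its old position.
    return [
        kmeans_reduce(groups[i]) if i in groups else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, max_iter=20, tol=1e-6):
    """Repeat passes until no centroid moves more than `tol`."""
    for _ in range(max_iter):
        new_centroids = kmeans_iteration(points, centroids)
        shift = max(dist(a, b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids
```

Because the reducer works on (partial sum, count) pairs, the same function could also serve as a combiner that pre-aggregates points on each mapper node before the shuffle, which matters when the input has millions of records.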

K-means parallel clustering application

Using Hadoop's MapReduce programming model and the K-means clustering algorithm, analyze the tourist preferences of provincial capital cities from crawled ticket data:

Airline data preprocessing
Cluster analysis of all provincial capital cities
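
As a rough sketch of what the preprocessing step might produce, the snippet below turns ticket records into one numeric vector per origin city, which is the kind of input K-means needs. The record format (origin, destination), the function name `build_preference_vectors`, and the choice of destination shares as features are all assumptions for illustration; the actual preprocessing in the application_hadoop_kmeans folder may differ.

```python
from collections import Counter, defaultdict


def build_preference_vectors(records, cities):
    """Turn (origin, destination) records into one vector per origin city.

    Each vector holds the share of that origin's trips going to every city
    in `cities`, so cities with very different traffic stay comparable.
    Records that mention a city outside `cities` are skipped.
    """
    counts = defaultdict(Counter)
    for origin, destination in records:
        if origin in cities and destination in cities and origin != destination:
            counts[origin][destination] += 1

    vectors = {}
    for origin, dest_counts in counts.items():
        total = sum(dest_counts.values())
        vectors[origin] = [dest_counts[c] / total for c in cities]
    return vectors
```

Normalizing counts to shares is one possible design choice: without it, the clustering would mostly separate high-traffic cities from low-traffic ones rather than grouping cities with similar destination preferences.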

Tourist preference page display

Use the Python Flask framework and the Baidu Map API to display the destination city preferences of tourists from each city:

The more often a city is visited, the larger its point on the map
Select the city to display through the drop-down box
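
One simple way to make a point grow with the number of visits is a linear scale between a minimum and a maximum size. The helper below is a hypothetical sketch: the name `marker_sizes`, the default size limits, and the linear scaling are assumptions, not the logic used in the web_hadoop_kmeans folder.

```python
def marker_sizes(visit_counts, min_size=6.0, max_size=30.0):
    """Scale visit counts linearly to marker sizes in [min_size, max_size]."""
    if not visit_counts:
        return {}
    low = min(visit_counts.values())
    high = max(visit_counts.values())
    span = high - low

    sizes = {}
    for city, visits in visit_counts.items():
        # If every city has the same count, draw all points at full size.
        fraction = (visits - low) / span if span else 1.0
        sizes[city] = min_size + fraction * (max_size - min_size)
    return sizes
```

A view function could compute these sizes for the city chosen in the drop-down box and return them as JSON for the map page to draw.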

Peace1997/Based-on-Hadoop-K-means-clustering-algorithm-Parallel-implementation
