Business-Intelligence-on-Big-Data-_-U-TAD-2017-Big-Data-Master-Final-Project

Business Intelligence on Big Data: U-TAD 2017 Big Data Master Final Project.

This is the final project I completed for the Big Data Expert Program at U-TAD in September 2017. It uses the following technologies: Apache Spark v2.2.0, Python v2.7.3, Jupyter Notebook (PySpark), HDFS, Hive, Cloudera Impala, Cloudera HUE and Tableau.

Big Data projects can be classified into two main groups: operational projects and research projects. Operational projects are those already being done with traditional tools, where Big Data technologies allow them to be carried out on more data, faster, and at lower cost. These are the projects typically undertaken by companies that are entering the Big Data world. Business Intelligence, the capability to turn data into information and information into knowledge so that business decision-making can be optimized, is one example of such an operational improvement project.

I used meteorological data from the United States National Oceanic and Atmospheric Administration (NOAA). Using Spark (programmed in Python from Jupyter Notebook), I transformed the CSV text files into a DataFrame and stored it as a table readable by Hadoop Hive, to be used as a data warehouse table. Using Cloudera HUE, a Hive data mart table is then created with the maximum daily temperatures of all countries throughout 2016. Finally, a 'select' query run in Impala from the BI visualization tool Tableau retrieves the temperatures data mart table, so that line graphs of the 2016 maximum daily world temperatures, filtered by country, can be drawn.
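The data mart step and the final 'select' can be illustrated with a minimal sketch. It is not the project's code: it uses Python's built-in sqlite3 module as a stand-in for Hive and Impala so that it runs without a cluster, and the table names (weather_dw, max_temp_2016), the column names, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (country, observation date, temperature reading).
# The values are made up for illustration, not taken from the NOAA files.
SAMPLE_ROWS = [
    ("Spain", "2016-01-01", 12.5),
    ("Spain", "2016-01-01", 15.0),
    ("Spain", "2016-01-02", 13.1),
    ("Norway", "2016-01-01", -3.0),
    ("Norway", "2016-01-01", -1.5),
    ("France", "2015-12-31", 9.0),  # outside 2016, so excluded from the data mart
]


def build_datamart(rows):
    """Load rows into a warehouse-style table and derive the data mart table."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for the data warehouse table written from the Spark DataFrame.
    conn.execute(
        "CREATE TABLE weather_dw (country TEXT, obs_date TEXT, temperature REAL)"
    )
    conn.executemany("INSERT INTO weather_dw VALUES (?, ?, ?)", rows)
    # Data mart: one row per country and day with the maximum 2016 temperature.
    conn.execute(
        """
        CREATE TABLE max_temp_2016 AS
        SELECT country, obs_date, MAX(temperature) AS max_temperature
        FROM weather_dw
        WHERE obs_date BETWEEN '2016-01-01' AND '2016-12-31'
        GROUP BY country, obs_date
        """
    )
    return conn


def daily_max_for_country(conn, country):
    """Mimic the per-country 'select' a visualization tool would issue."""
    cursor = conn.execute(
        "SELECT obs_date, max_temperature FROM max_temp_2016 "
        "WHERE country = ? ORDER BY obs_date",
        (country,),
    )
    return cursor.fetchall()


conn = build_datamart(SAMPLE_ROWS)
spain = daily_max_for_country(conn, "Spain")
print(spain)
```

Each returned tuple corresponds to one point on a per-country line graph. In the real pipeline the grouping would run in Hive and the filtered query in Impala, so the exact SQL dialect and schema may differ from this sketch.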

Repository contents:

con01
datos meteorologicos comienzo proyecto
logos
mapReduce 20170712
mapReducePruebaMailJconca 20170715
sparkTFM 20170715
20170910 PedroTobarra_UTAD_TrabajoFinProgramaExperto.docx
20170910 PedroTobarra_UTAD_TrabajoFinProgramaExperto.pdf
Asignación - PEBD_1609 - 20170521.pdf
Catalogo de proyectos 20170512.pdf
Normativa Trabajos Fin de Experto 20170512.pdf
Petición proyecto Big Data 20170512.msg
README.md
TrabajoFindeExperto 20170520.PNG
apuntes Spark 20170720.txt
comandos_python_spark_20170721.txt

ptobarra/Business-Intelligence-on-Big-Data-_-U-TAD-2017-Big-Data-Master-Final-Project
