This repository contains the Ansible playbooks and supporting files used to work with different applications
Updated Apr 4, 2023 - Python
ETH analysis using big data for the QMUL Big Data Processing module. Intended to promote analysis of data retrieved via big data processing.
Third homework of CloudComputing - Fall 2022
Assignments for the Big Data course, Spring 2017 semester at Sapienza
Architected and developed a horizontally scalable data processing solution for the Reddit dataset. Demonstrated its scalability through weak-scaling and strong-scaling tests.
In this project, we used both Hadoop MapReduce and Spark for distributed computing. The first task was to perform a series of operations using Mapper and Reducer Java files run on a Hadoop server. The second task was to perform similar operations on Spark instead.
Processing and transforming data via the Hadoop ecosystem
Cloud Computing Tutorials for AWS
Principles and hands-on practice of HDFS, MapReduce, Hive, and Zookeeper
Hadoop cluster on Docker (single host)
An Ansible role to configure and set up a Hadoop DataNode.
A framework for running various failure tests against a Hadoop cluster
My work and notes covering the Hadoop and Spark ecosystems
Simple inverted indexing algorithm implemented with Hadoop
Apache Pig Latin script to count letters in multiple input text files, using the Hortonworks Hadoop Sandbox or Google Cloud Platform