# hadoop-ecosystem

Here are 40 public repositories matching this topic...

dhkdn9192 / data_engineer_career

Everything needed for a data engineering (DE) role

interview-questions data-engineer hadoop-ecosystem

Updated May 24, 2024
Jupyter Notebook

ArwaEiad / TMDB-Project

This project analyzes movie data using PySpark, tailored for efficient data processing on the Hadoop Distributed File System (HDFS).

pyspark hdfs hadoop-ecosystem

Updated May 6, 2024
Jupyter Notebook

madd86 / awesome-system-design

A curated list of awesome System Design (A.K.A. Distributed Systems) resources.

distributed-systems microservices nosql interview stream-processing microservices-architecture relational-database message-broker hadoop-ecosystem

Updated Mar 26, 2024

tingjhenjiang / bigdata_docker_images

Containers used for parallel batch and stream data processing and for setting up machine learning environments

dockerfile spark jupyterhub hadoop-ecosystem

Updated Mar 27, 2023
Dockerfile

ZuInnoTe / hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

spark hive hadoop excel bigdata office poi flink hadoop-ecosystem hadoopoffice analyze-office-documents

Updated Oct 29, 2022
Java

pfisterer / apache-knox-helm

Helm chart for Apache Knox

hadoop apache helm-charts yaml-configuration knox hadoop-ecosystem apache-knox

Updated Sep 28, 2022
Mustache

meliodaseren / mapreduce-demo

Hadoop MapReduce

java hadoop mapreduce hadoop-ecosystem

Updated Aug 21, 2022
Java

Cigna / ibis

IBIS is a workflow creation engine that abstracts the Hadoop internals of ingesting RDBMS data.

workflow hadoop ingestion oozie sqoop sqoop2 workflow-automation workflow-scheduler hadoop-ecosystem hadoop-framework ibis cigna

Updated Apr 13, 2022
Python

simple-learning / Hadoop

Hadoop Projects

hadoop java-8 hadoop-mapreduce hadoop-streaming hadoop-ecosystem hadoop-testing hadoop-mrunit mrunit

Updated Apr 12, 2022
Java

pfisterer / apache-knox-docker

Dockerfile for running Apache Knox (http://knox.apache.org/) in Docker

dockerfile hadoop rest-api hadoop-cluster hadoop-ecosystem apache-knox gateway-server

Updated Mar 21, 2022
Dockerfile

SarahAyaz / YouTube_Data_Analysis

Analysis of YouTube data using the Hadoop MapReduce framework in Java.

java linux youtube hadoop analysis hdfs mapreduce hadoop-filesystem hadoop-mapreduce hadoop-ecosystem mapreduce-java hadoop-hdfs partitioner

Updated Jan 30, 2022
Java

jodth07 / hadoop-installation

Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04

scala kafka spark hadoop sbt installation flume kafka-installation hadoop-ecosystem hadoop-installation hadoop-hdfs spark-installation sbt-installation scala-installation

Updated Jul 17, 2021
Shell

saitejavishalj / Hotspot-analysis-of-Geospatial-data

Built a large-scale distributed data processing system for streaming analytics using the Hadoop ecosystem (Apache Spark and HDFS) in the cloud, for real-time spatial analytics.

distributed-systems apache-spark hdfs data-analysis sparksql large-scale hadoop-ecosystem streaming-analytics apache-hadoop

Updated Jun 4, 2021
Scala

Rohit-Jain-2801 / HadoopInstallGuide

Apache Hadoop Components Installation Guide on Windows

java windows hive hadoop apache hbase pig hdfs installation hadoop-ecosystem apache-pig apache-hive installation-guide apache-hbase

Updated May 2, 2021

AnkitaSinha98 / Customer360-Data-Analysis

Big data on various customers is stored and analyzed using Hadoop and other tools such as Hive, ZooKeeper, HBase, and Sqoop. All customer details are analyzed and the results are reported; these results are very useful for companies.

hive hadoop hbase zookeeper dataset pig sqoop hadoop-ecosystem big-data-analytics

Updated Feb 10, 2021

PrathameshNimkar / Big-Data-Analysis-using-the-Hadoop-Ecosystem

Learn and implement the Hadoop Ecosystem to drive Big Data Analytics.

big-data cloudera tutorials cloudera-manager hadoop-ecosystem big-data-analytics

Updated Dec 30, 2020

vineetdcunha / Hadoop_Ecosystem

Processing and transforming data via Hadoop Ecosystem

python hive hadoop python-script hbase pyspark mahout pig hadoop-cluster hadoop-mapreduce hadoop-streaming hadoop-ecosystem hiveql multinode hadoop-hdfs hbase-standalone

Updated Nov 26, 2020
Python

DiegoBulhoes / hadoop-ansible-single-node

Environment for practicing the Ansible and Hadoop tools using a single instance

ansible vagrant hadoop hadoop-ecosystem single-node

Updated Jun 29, 2020
Shell

reggert / cumulative

[Work in progress] Client library for simplified access to Apache Accumulo

scala spark bigdata accumulo hadoop-ecosystem

Updated May 7, 2020
Scala

m-r-tanha / Hadoop-Ecosystem

This repository will be updated based on my challenges in installing and using Hadoop's tools (Spark)

hadoop-ecosystem

Updated Mar 29, 2020
