bigdata

We'll create a simple Java application using Spark that integrates with the Kafka topic we created earlier. The application reads messages as they are posted and counts the frequency of words in each message.

python kafka hive hadoop bigdata spark-streaming mapreduce hortonworks hivedb

Updated Sep 5, 2022
Python

Dorianteffo / uber-eats-airflow-spark-glue-athena

Ingest CSV files and load them to S3, upload the Spark script to S3, and run it on an EMR cluster, which pulls the raw UberEats data from S3, cleans it, and loads it back to S3 in the proper schema. All of this is orchestrated with Airflow.

python emr aws airflow spark athena bigdata glue

Updated Jan 27, 2024
Python

AyrtonAranibar / Predicci-n-de-precios-con-redes-neuronales

Academic project at UCSM university: we built a multilayer neural network to predict home sale prices in Argentina and Uruguay.

python data-science neural-network bigdata artificial-intelligence

Updated Oct 17, 2023
Python

cdelmonte-zg / ecs-for-big-data

This project aims to propose and evaluate the performance of the Entity Component System (ECS) architecture for Big Data and AI pipelines.

ai bigdata architectural-patterns

Updated Jun 3, 2024
Python

rajuranjan00 / Real-time-voting-system

This repository contains a real-time election voting system built with Python, Kafka, Spark Streaming, Postgres, and Streamlit. Docker Compose is used to launch the required services in Docker containers.

apache-spark docker-compose bigdata postgresql apache-kafka realtime-analytics streamlit-dashboard realtime-voting-system realtime-election

Updated Mar 21, 2024
Python

Mmiglio / bigDL4HEP

Intel BigDL for high energy physics

machine-learning spark deep-learning physics bigdata bigdl

Updated Jan 17, 2019
Python

kleg26315 / Stockdata_Analysis

Stock price data analysis

bigdata knn

Updated Jan 17, 2020
Python

levankhelo / ElasticSearch-Kibana-Py

Containerized: ElasticSearch + Kibana

docker elasticsearch kibana docker-compose containers bigdata ml python3 machinelearning dockerimage containerization kibana-visualization

Updated Apr 30, 2022
Python

dsvinod90 / Project_Phase_2

Loads the Spotify Million Playlist dataset into MongoDB, queries the earlier PostgreSQL database, and optimizes the relational database queries.

mongodb bigdata postgresql python3

Updated Mar 31, 2023
Python

Gurpreet17 / UC-Davis-SQL-for-Data-Science-Specialization

Completed the SQL Basics for Data Science Specialization from the University of California, Davis, gaining proficiency in Data Analysis, SQL, Apache Spark, and Delta Lake.

data-science apache-spark sqlite bigdata data-analysis delta-lake

Updated Dec 3, 2023
Python

hatamiarash7 / elasticsearch-dump

Imports raw JSON into Elasticsearch using multiple threads

python elasticsearch json multi-threading big-data json-data bigdata multithreading threading bulk-operation bulk-loader bulkimport bulk-inserts

Updated Apr 1, 2024
Python

Ouzrour / Ouzrour-Combolists-Filter

A script that helps you easily filter all your combolists by country (it also removes duplicate email addresses).

python data filter bigdata concurrency combolist

Updated Apr 28, 2023
Python

bigdata

Here are 335 public repositories matching this topic...

Raveesh1505 / BigData-Training

hirenpandya / Analysing-Large-Email-datasets-using-Spark-Big-Data-on-Amazon-AWS

anubhavsaxena14 / Intro-to-Hadoop-and-MapReduce

data4e / data4e

bigdatavietnam-org / data-science-101-course

owuordickson / large_gps

sameeerjadhav / IoT-Data-Processing-and-Analytics

majid0110 / Kafka-Producer-and-consumers-using-python

majid0110 / Spark-Streaming-with-Kafka-in-python
