@cerndb

CERN Database and Analytics Group

Popular repositories

  1. dist-keras Public archive

    Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

    Python, 622 stars, 167 forks

  2. spark-dashboard Public

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile, 120 stars, 21 forks

  3. SparkPlugins Public

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics syst…

    Scala, 88 stars, 15 forks

  4. hdfs-metadata Public

    Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.

    Java, 56 stars, 19 forks

  5. grafana-mimir-cardinality-dashboards Public

    Grafana Mimir dashboards used for cardinality exploration

    40 stars, 7 forks

  6. SparkDLTrigger Public

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"

    Jupyter Notebook, 29 stars, 14 forks

Repositories

Showing 10 of 66 repositories
  • SparkTraining Public

    Material for the course "Introduction to Apache Spark APIs for Data Processing" https://sparktraining.web.cern.ch/

    Jupyter Notebook 12 CC-BY-4.0 6 0 0 Updated Mar 12, 2025
  • spark-dashboard Public

    Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an Apache Spark Performance Dashboard using container technology.

    Dockerfile 120 Apache-2.0 21 1 0 Updated Feb 28, 2025
  • opentelemetry-collector-contrib Public Forked from open-telemetry/opentelemetry-collector-contrib

    Contrib repository for the OpenTelemetry Collector

    Go 0 Apache-2.0 2,634 0 0 Updated Feb 22, 2025
  • grafana-mimir-cardinality-dashboards Public

    Grafana Mimir dashboards used for cardinality exploration

    40 Apache-2.0 7 4 0 Updated Jan 15, 2025
  • hadoop-xrootd Public

    Mirror of CERN db/hadoop-xrootd. Hadoop-XRootD Filesystem Connector

    Java 6 Apache-2.0 3 3 1 Updated Sep 25, 2024
  • SparkDLTrigger Public

    Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"

    Jupyter Notebook 29 Apache-2.0 14 0 0 Updated Jun 11, 2024
  • argo-helm Public Forked from argoproj/argo-helm

    ArgoProj Helm Charts

    Mustache 0 Apache-2.0 1,923 0 0 Updated May 28, 2024
  • NotebooksExamples Public

    This repository contains Jupyter notebook examples intended to be linked from the SWAN Gallery.

    Jupyter Notebook 1 Apache-2.0 1 0 0 Updated May 16, 2024
  • SparkPlugins Public

    Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are initialized. This also allows extending the Spark metrics system with user-provided monitoring probes.

    Scala 88 Apache-2.0 15 3 1 Updated Apr 2, 2024
  • sparkMeasure Public

    This is a mirror of https://github.com/LucaCanali/sparkMeasure. sparkMeasure is a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics.

    Scala 14 Apache-2.0 3 0 0 Updated Mar 11, 2024
