Large-scale Scientific Data Platform: Interactive Spark Notebooks, Real-time REST APIs and Data Visualisation Dashboards (developed at ISC-PIF/CNRS).

  1. multivac-wikipedia

    Reusable code, libraries and scripts for processing Wikipedia page views with Apache Spark.

  2. multivac-kaggle-titanic

    A simple example of the Kaggle Titanic competition using Apache Spark 2.2.

  3. multivac-nlp

    Tests and benchmarks of some of the existing NLP libraries for Apache Spark.

  4. multivac-fakenews

    Detecting users and communities that propagate fake news on Twitter, using Apache Spark.

  5. multivac-ml

    Pre-trained ML models built at large scale with Apache Spark.

  6. es-punchcard

    Builds punchcard charts from Elasticsearch histogram aggregations.
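
As a rough illustration of the idea behind es-punchcard, the sketch below folds histogram buckets into a weekday-by-hour grid, which is the shape a punchcard chart plots. It is not taken from the project. It assumes hourly buckets shaped like those of an Elasticsearch `date_histogram` aggregation (a `key` in epoch milliseconds and a `doc_count`). The function name `punchcard_grid` and the sample data are hypothetical.

```python
from datetime import datetime, timezone


def punchcard_grid(buckets):
    """Fold hourly histogram buckets into a 7x24 grid of counts.

    Rows are weekdays (0 = Monday), columns are hours of the day in UTC.
    Hypothetical helper, not part of es-punchcard.
    """
    grid = [[0] * 24 for _ in range(7)]
    for bucket in buckets:
        # Assumption: `key` is the bucket start in epoch milliseconds.
        ts = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc)
        grid[ts.weekday()][ts.hour] += bucket["doc_count"]
    return grid


# Made-up sample buckets for illustration only.
sample_buckets = [
    {"key": 1514800800000, "doc_count": 3},  # Monday 2018-01-01 10:00 UTC
    {"key": 1515405600000, "doc_count": 2},  # Monday 2018-01-08 10:00 UTC
    {"key": 1514905200000, "doc_count": 5},  # Tuesday 2018-01-02 15:00 UTC
]

grid = punchcard_grid(sample_buckets)
```

Each cell of `grid` can then be drawn as a circle whose size is proportional to its count. Buckets from different weeks that fall on the same weekday and hour are summed into one cell.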

