This repository shows different ways (split1.py, split2.py, split3.py, split4.py, split5.py and split6.py) to distribute a matrix (1000x4801 elements), which has been generated random…
Repository created for use in the RSE19 Microsoft-sponsored workshop
This repository stores workflows that we have built for different scientific communities (e.g., seismology, astrophysics).
This repository explains how to set up a Spark (2.4.0) cluster on demand on the EDDIE HPC cluster (University of Edinburgh). Additionally, it shows how to submit Spark Text Mining queries to such S…
This repository stores two versions of the same library, which I designed for compressing MPI messages at run time using different compression algorithms. The compression is adaptive, so it turn…
Code to analyse books and newspapers data using Apache Spark.
This repository describes the steps necessary to create a Spark cluster within a PBS job. We have tested these scripts on the Cirrus HPC cluster, hosted at EPCC (University of Edinburgh).
Presentations that I prepared for a two-day Apache Spark training course at BGS
Enabling Complex Analysis of Large Scale Digital Collections (viz)
Sensor EMB datastreaming
Enabling Complex Analysis of Large Scale Digital Collections (code)
VERCE Science Gateway http://verce.eu
Working with iRods to analyse the Times Digital Archive
This repository stores training material that we have presented at different events. It contains presentations as well as several dispel4py workflows.
Collection of the data science projects that I am working on at BGS
Repo containing docs and outputs from the RSE4DataScience18 meeting.
Dockerized Hadoop HDFS with Yarn, Spark and Zeppelin
MINT: Model INTegration
An integration test of kafka-pyspark-elasticsearch using docker
Deploy ELK stack and kafka with docker-compose
The Mines Java Toolkit
A Mapping of Reprints and Re-use within the 19th Century Anglophone Newspaper Press
Dockerfile for Spark
Pegasus and dispel4py hybrid workflows for data-intensive science
Fetching, parsing, and analyzing data from the 1000 genomes project.