blackhat06/ebolabigdataorg

ebolabigdata.org is a humanitarian data science project dedicated to solving real-world problems related to Ebola and its impacts.

List of Languages and tools used:
  • R :- predictive modeling, reactive graphics, Shiny
  • Python :- machine learning, data analysis and web data mining
  • Hadoop :- data collection
    • Pig
    • DataFu
  • Gephi :- graph visualization
  • Tableau :- data visualization
  • Bloomberg L.P. API :- real-time feed of stock market data
  • Node.js :- used for Watson and the Bloomberg L.P. API
  • Chef :- automated deployment to DigitalOcean and AWS
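We are using R 3.1.0 because it supports all the important packages, some of which the latest version does not support.
Also, R versions >= 3.0 handle large vectors of up to 64 GB without problems, whereas version 2.15.3 supports vectors of only up to 2 GB (R 3.0.0 introduced long vectors on 64-bit platforms, lifting the earlier limit of 2^31 - 1 elements per vector). If a data frame is larger than 64 GB, use the bigmemory package.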
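As a rough aid for deciding when a dataset crosses the 64 GB mark mentioned above, here is a minimal Python sketch. It assumes an all-numeric table stored as 8-byte doubles (as R stores numeric columns) and reads "64 GB" as 64 GiB. The function names `estimated_bytes` and `needs_bigmemory` are hypothetical helpers for illustration, not part of this project or of any package.

```python
# Rough size estimate for an all-numeric table.
# Assumption: every cell is an 8-byte double, as in an R numeric column.
BYTES_PER_DOUBLE = 8
GIB = 1024 ** 3

# Threshold taken from the text above; "64 GB" is treated as 64 GiB here (assumption).
IN_MEMORY_LIMIT_BYTES = 64 * GIB


def estimated_bytes(n_rows: int, n_cols: int) -> int:
    """Approximate storage for n_rows x n_cols double-precision values."""
    return n_rows * n_cols * BYTES_PER_DOUBLE


def needs_bigmemory(n_rows: int, n_cols: int) -> bool:
    """True when the estimate exceeds the in-memory threshold."""
    return estimated_bytes(n_rows, n_cols) > IN_MEMORY_LIMIT_BYTES


if __name__ == "__main__":
    # Two illustrative shapes: one small, one well past the threshold.
    for rows, cols in [(1_000_000, 50), (2_000_000_000, 10)]:
        size_gib = estimated_bytes(rows, cols) / GIB
        verdict = "consider bigmemory" if needs_bigmemory(rows, cols) else "fits in memory"
        print(f"{rows} x {cols}: about {size_gib:.1f} GiB, {verdict}")
```

The estimate ignores character columns, factors and object overhead, so treat it as a lower bound rather than an exact figure.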
R packages
  • ggplot2 :- Graphics in R
  • reshape :- restructuring the data
  • plyr :- apply-style functions, as in S+
  • gdata :- reading XLS files
  • rggobi :- GGobi and R
  • ggmap :- spatial visualization
  • mclust :- model-based clustering
  • knitr :- report generation
  • stringr :- easier to work with strings
  • osmar :- OpenStreetMap and R
  • twitteR :- Twitter API
  • shiny :- web application framework
  • PIN :- measuring information asymmetry in financial markets
  • bigmemory :- working with data that exceeds available RAM
Philosophy
  • less is more
  • better cognitive visualization
  • effective modelling
  • precise transformation
  • data science as an ecosystem spanning Python, R, Java, JS, SQL, regex, DevOps and XPath
Contribute

Please do send in pull requests for source code, suggestions, typographic errors, and hyperlinks.

Contributors
Vikash Ruhil 

About

This is a humanitarian project based on Ebola information and data analysis.
