feng-li/Distributed-Statistical-Computing

Distributed Statistical Computing (Distributed Computing for Big Data: Lecture Notes and Case Studies)

Developed by

Feng Li
School of Statistics and Mathematics
Central University of Finance and Economics
feng.li@cufe.edu.cn

Course Homepage

https://feng.li/distcomp/

Books (companion textbook, in Chinese)

  • Distributed Statistical Computing for Big Data and Case Studies (大数据分布式计算与案例), ISBN: 9787300230276

  • New edition (in preparation)

Contents

Teaching slides and demo code (in Jupyter Notebook format)

Quick View

You can view all the notebooks in this repository with the Jupyter Notebook Viewer.

Run the demo locally

Requirements for running the notebooks interactively:

  • Python (>= 3.6.0)

    • findspark (to invoke Spark from a Python session)
    • numpy, scipy, pandas
  • Hadoop (>= 2.7.0)

  • Hive (>= 2.3.3)

  • Spark (>= 2.3.1)

  • Jupyter Notebook (>= 5.0)
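
As a rough way to confirm that the requirements above are in place before opening a notebook, the following sketch uses only the Python standard library to look for each item. The executable names (`hadoop`, `hive`, `spark-submit`, `jupyter`), the helper names `check_environment` and `start_spark`, and the application name `distcomp-demo` are assumptions made for illustration and are not taken from this repository. The check only reports whether each item is present; it does not verify the minimum versions of Hadoop, Hive, Spark, or Jupyter.

```python
import importlib.util
import shutil
import sys

# Minimum Python version from the requirements list.
MIN_PYTHON = (3, 6, 0)
# Python packages named in the requirements list.
PACKAGES = ["findspark", "numpy", "scipy", "pandas"]
# Executable names are assumptions about a typical installation.
COMMANDS = ["hadoop", "hive", "spark-submit", "jupyter"]


def check_environment():
    """Return a dict mapping each requirement to True (found) or False."""
    report = {"python": sys.version_info[:3] >= MIN_PYTHON}
    for name in PACKAGES:
        # find_spec returns None when a top-level package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    for cmd in COMMANDS:
        # shutil.which returns None when the executable is not on PATH.
        report[cmd] = shutil.which(cmd) is not None
    return report


def start_spark(app_name="distcomp-demo"):
    """Start a local SparkSession via findspark, or return None if findspark is missing."""
    if importlib.util.find_spec("findspark") is None:
        return None
    import findspark

    # Locates the Spark installation (for example via SPARK_HOME)
    # and adds pyspark to sys.path.
    findspark.init()
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).master("local[*]").getOrCreate()


if __name__ == "__main__":
    for requirement, found in check_environment().items():
        print(f"{requirement:<14}{'ok' if found else 'missing'}")
```

Calling `start_spark()` from a notebook cell would then return a local session if findspark and a Spark installation are available; the `local[*]` master is a choice for single-machine experiments and would differ on a cluster.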

Distributed Statistical Computing Cases in Markdown and TeX format (outdated).