AmitKus/SCIPY2019

Main Learnings:

  1. Start using JupyterLab (version 1.0 released), and stop using Python 2 completely!
  2. Great conference for learning about tools and packages.
  3. Don't expect to learn any machine learning/data science.
  4. Good venue for recruiting (scientific software oriented developers/researchers)
  5. Scipy-India: Probable conference for BTC

Keynote: Stuart Geiger

A good infrastructure is hard to find.

Books

  1. Sorting Things Out
  2. The Structure of Scientific Revolutions
  3. Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure

Links: Data Science Issues (non-technical)

  1. Data Science Careers in Academia
  2. Best Practices in Data Science
  3. Documentation in Data Science

Keynote: The New Era of NLP

New fast.ai course on NLP

sebastianruder/NLP-progress

spaCy: NLP Library

ULMFiT

Word Embeddings: Word2Vec, GloVe

XLNet

Wikitext: Wikipedia data

OpenAI GPT-2: Unicorn story, military spending discussion

Katie Jones: GAN (www.thispersondoesnotexist.com)

Censorship via information glut.

Any algorithm for detecting fakes can also be used to make better fakes.

How will we prevent AI-Based Forgery? - Oren E

Computational Statistics

Weighted Bootstrap Sampling
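
The notes do not record how the talk defined weighted bootstrap sampling. A minimal sketch of one common reading, assuming it means resampling with replacement where each observation has its own selection weight; the function names, data, and weights below are illustrative and not taken from the talk:

```python
import random
import statistics


def weighted_bootstrap_means(data, weights, n_resamples=1000, seed=0):
    """Resample `data` with replacement, drawing each point in proportion
    to its weight, and return the mean of every resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    means = []
    for _ in range(n_resamples):
        sample = rng.choices(data, weights=weights, k=n)
        means.append(statistics.fmean(sample))
    return means


def percentile_interval(values, alpha=0.05):
    """Simple percentile interval over the bootstrap statistics."""
    ordered = sorted(values)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


# Hypothetical data: the last observation is five times as likely to be drawn.
data = [2.0, 4.0, 6.0, 8.0]
weights = [1, 1, 1, 5]

means = weighted_bootstrap_means(data, weights)
low, high = percentile_interval(means)
print(f"bootstrap mean: {statistics.fmean(means):.2f}, 95% interval: ({low:.2f}, {high:.2f})")
```

The bootstrap mean should land near the weighted mean of the data (6.5 here) rather than the unweighted mean (5.0).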

Visual Diagnostics at Scale

yellowbrick: model visualization

PYTHON Visualization 2019

Dashboarding solutions:

  1. Panel
  2. Voila
  3. Dash

Viztools overview

yt: scientific simulation data viz, unstructured meshes, etc.

gr: language-agnostic, performant viz package.

glue: a Python project to link visualizations of scientific datasets across many files.

Real-time data: vispy, holoviz

GANs Talk IBM Research

GAN Hacks

Model remodeling with Modern Deep Learning Frameworks

www.ethanrosental.com

Probabilistic Programming

Accelerated drug development

Packages

Plotly dash + iPywidgets (Python)

Voila (language agnostic, great live demo presentation)

bqplot (2d- plotting package)

ipyvuetify: for dashboarding

heroku: deployment

ipymaterialui

iframe: to embed in existing webpages.

HiGlass: Big data viz, time series etc

papermill: parametrize and execute notebooks

pyviz: un-opinionated collection of most Python viz libraries

holoviz: opinionated

Panel: similar to Voila, looks interesting

vaex: memory efficient dataframes looks interesting

cudf: Nvidia's cuda dataframe

Apache Arrow: read about this

Numba: numpy compiled, fast (seems to have matured now, try it)

napari: image viewer

xmsmesh: meshing in python

Dask deployment:

  1. Kubernetes
  2. YARN
  3. HPC clusters

itkwidgets: 2D-3D viz

LFortran: Inline Fortran in Jupyter!!

geopandas:

apricot: Submodular optimization for machine learning

yellowbrick: looks interesting

joblib: running Python functions as pipeline jobs
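
A small sketch of what running functions as jobs can look like with joblib's `Parallel` and `delayed` helpers. The `square` function and the inputs are made up for illustration, and the thread-based backend is chosen here only to keep the example lightweight:

```python
from joblib import Parallel, delayed


def square(x):
    """Stand-in for a more expensive function."""
    return x * x


# Each delayed(square)(i) call describes one job; Parallel runs them
# with two workers and returns the results in input order.
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(square)(i) for i in range(5)
)
print(results)
```

For CPU-bound pure-Python functions, dropping `prefer="threads"` lets joblib use its default process-based backend instead.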

xarray: multidimensional array looks interesting

pandas: geopandas, pint, fletcher, cyberpandas

Dask: Dask provides advanced parallelism for analytics, enabling performance at scale

pyarrow: read about this

pyOpenSci: guidelines to contributing in open source

visdom: A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

skorch: scikit-learn + pytorch

About

Notes from SCIPY 2019 Conference.
