- Start using JupyterLab (version 1.0 has been released), and stop using Python 2 completely!
- Great conference for learning about tools and packages.
- Don't expect to learn any machine learning/data science.
- Good venue for recruiting (scientific software oriented developers/researchers)
- SciPy India: probable conference for BTC
A good infrastructure is hard to find.
- Sorting Things Out
- The Structure of Scientific Revolutions
- Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure
New fast.ai course on NLP
sebastianruder/NLP-progress
spaCy: NLP Library
ULMFiT
Word Embeddings: Word2Vec, GloVe
XLNet
Wikitext: Wikipedia data
OpenAI GPT-2: Unicorn story, military spending discussion
Katie Jones: GAN (www.thispersondoesnotexist.com)
Censorship via information glut.
Any algorithm for detecting fakes can also be used to make better fakes.
How will we prevent AI-Based Forgery? - Oren E
Weighted Bootstrap Sampling
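A minimal sketch of what weighted bootstrap sampling could look like, assuming it means resampling with replacement where each observation is drawn with probability proportional to a weight. The function name `weighted_bootstrap`, the toy data, and the weights are hypothetical and are not taken from the talk.

```python
import numpy as np


def weighted_bootstrap(data, weights, n_resamples=1000, statistic=np.mean, seed=0):
    """Return `n_resamples` bootstrap replicates of `statistic`.

    Each replicate resamples `data` with replacement, drawing observation i
    with probability proportional to weights[i].
    """
    data = np.asarray(data)
    probs = np.asarray(weights, dtype=float)
    probs = probs / probs.sum()  # normalize so the probabilities sum to 1
    rng = np.random.default_rng(seed)
    n = len(data)
    replicates = np.empty(n_resamples)
    for b in range(n_resamples):
        # Draw n indices with replacement, using the per-observation weights.
        idx = rng.choice(n, size=n, replace=True, p=probs)
        replicates[b] = statistic(data[idx])
    return replicates


# Hypothetical toy data: the last point is five times as likely to be drawn.
data = [2.0, 4.0, 6.0, 8.0]
weights = [1, 1, 1, 5]
reps = weighted_bootstrap(data, weights)

# Percentile interval for the weighted mean, taken from the replicates.
low, high = np.percentile(reps, [2.5, 97.5])
```

The replicates scatter around the weighted mean of the data (6.5 for these toy values), and the spread of `reps` gives a rough uncertainty estimate for it.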
yellowbrick: model visualization
Dashboarding solutions:
- Panel
- Voila
- Dash
yt: Scientific simulation data viz, unstructured mesh etc.
gr: language agnostic performant viz package.
glue: Glue is a Python project to link visualizations of scientific datasets across many files.
real time data: vispy, holoviz
Accelerated drug development
Plotly Dash + ipywidgets (Python)
Voila (language agnostic, great live demo presentation)
bqplot (2D plotting package)
ipyvuetify: for dashboarding
heroku: deployment
ipymaterialui
iframe: to embed in existing webpages.
HiGlass: Big data viz, time series etc
papermill: parametrize and execute notebooks
pyviz: unopinionated collection of most Python viz libraries
holoviz: opinionated
Panel: similar to Voila, looks interesting
vaex: memory efficient dataframes looks interesting
cudf: NVIDIA's CUDA dataframe
Apache Arrow: read about this
Numba: JIT compiler for NumPy-style Python code, fast (seems to have matured now, try it)
napari: image viewer
xmsmesh: meshing in python
Dask deployment:
- Kubernetes
- YARN
- HPC clusters
itkwidgets: 2D-3D viz
LFortran: inline Fortran in Jupyter!
geopandas:
apricot: Submodular optimization for machine learning
yellowbrick: looks interesting
joblib: running Python functions as pipeline jobs
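A small sketch of the joblib idea, assuming the point of interest is fanning a function out over inputs with `Parallel` and `delayed`. The choice of `sqrt` and the input range are illustrative only; `prefer="threads"` is used here so the sketch runs anywhere, and dropping it lets joblib use its default process-based backend, which is usually what CPU-bound work wants.

```python
from math import sqrt

from joblib import Parallel, delayed

# delayed(sqrt)(x) records the call without running it;
# Parallel then executes the recorded calls across 2 workers.
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(sqrt)(i ** 2) for i in range(5)
)

print(results)  # results come back in input order
```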
xarray: multidimensional array looks interesting
pandas: geopandas, pint, fletcher, cyberpandas
Dask: advanced parallelism for analytics, enabling performance at scale
pyarrow: read about this
pyOpenSci: guidelines to contributing in open source
visdom: A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.
skorch: scikit-learn + pytorch