Add informational banner to README about forthcoming v0.9 (#1260)
* Add informational banner to README about forthcoming v0.9

* Update README banner

* Update bullets in README banner
bhancock8 committed Jul 12, 2019
1 parent c17c38a commit 1a78ed6
Showing 1 changed file with 23 additions and 4 deletions.
27 changes: 23 additions & 4 deletions README.md
@@ -1,7 +1,26 @@
<img src="figs/logo_01.png" width="150"/>

# ANNOUNCEMENT:
**Snorkel v0.9 is being released this summer!**

**_v0.7.0-beta_**
Why is it 0.9 when the last version was 0.7? Because this is more than just an incremental change. It's a full redesign from the ground up, including:
* Support for new training data operators: labeling functions (LFs), transformation functions (TFs), and slicing functions (SFs)
* A new matrix-completion-based approach for learning LF accuracies and correlation structure
* Native support for multi-task learning (MTL), transfer learning (TL), and complex model flows
* A Snorkel 101 guide that provides a gentle introduction to the technology and API for first-time users
* A fresh batch of tutorials demonstrating different use cases and integrations
* A more modular form factor that makes it easier to integrate with other libraries
* A commitment to stability, with full coverage unit tests, type checking, and doc strings

The new version will be available via `pip` and `conda` but **will not be backwards compatible**. The current (v0.7) Snorkel code will be moved to another repository to continue to support existing applications that depend on it.

As part of this refactor, we will be bringing under one roof a number of projects in the Snorkel ecosystem that have previously been posted in separate repositories—[Snorkel](https://github.com/HazyResearch/snorkel), [Snorkel MeTaL](https://github.com/HazyResearch/metal), [TANDA](https://github.com/HazyResearch/tanda), etc.—and which have been used to achieve state-of-the-art results on the [GLUE](https://dawn.cs.stanford.edu/2019/03/22/glue/) and [SuperGLUE](https://hazyresearch.github.io/snorkel/blog/superglue.html) benchmarks, automate [cardiac MRI classification](https://www.biorxiv.org/content/10.1101/339630v1) and [genetic research database curation](https://ai.stanford.edu/~kuleshov/papers/gwaskb-manuscript.pdf) (as featured in two forthcoming Nature papers), and extract information from electronic health record (EHR) data for national [medical device surveillance](https://arxiv.org/abs/1904.07640).

If you'd like to stay in the loop on the latest news in the Snorkel ecosystem, join the [Snorkel Google Group](https://groups.google.com/forum/#!forum/snorkel-ml). We'll keep you posted!

---

**_v0.7.0_**

[![Build Status](https://travis-ci.org/HazyResearch/snorkel.svg?branch=master)](https://travis-ci.org/HazyResearch/snorkel)
[![Documentation](https://readthedocs.org/projects/snorkel/badge/)](http://snorkel.readthedocs.io/en/master/)
@@ -11,14 +30,14 @@

* For the latest news, blog posts, tutorials, papers, etc. related to Snorkel, check out **[snorkel.stanford.edu](http://snorkel.stanford.edu)**!
* Get [set up](#quick-start) quickly
* Try the [tutorials](#tutorials)
* Read the [documentation](http://snorkel.readthedocs.io/en/master/)


## Motivation

Snorkel is a system for rapidly **creating, modeling, and managing training data.**
Today's state-of-the-art machine learning models require _massive_ labeled training sets--which usually do not exist for real-world applications. Instead, Snorkel is based around the new [data programming](https://papers.nips.cc/paper/6523-data-programming-creating-large-training-sets-quickly) paradigm, in which the developer focuses on writing a set of labeling functions, which are just scripts that programmatically label data.
The resulting labels are noisy, but Snorkel automatically models this process—learning, essentially, which labeling functions are more accurate than others—and then uses this to train an end model (for example, a deep neural network in TensorFlow).
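
To make the idea of a labeling function concrete, here is a minimal sketch in plain Python. It does not use the Snorkel API: the function names, the spam example, and the label constants are all hypothetical, and a simple majority vote stands in for the model Snorkel learns over labeling-function accuracies.

```python
from collections import Counter

# Hypothetical label constants for a toy spam task (not Snorkel's API).
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_contains_free(text: str) -> int:
    """Vote POSITIVE (spam) if the message mentions 'free'."""
    return POSITIVE if "free" in text.lower() else ABSTAIN


def lf_short_message(text: str) -> int:
    """Vote NEGATIVE (not spam) for messages under four words."""
    return NEGATIVE if len(text.split()) < 4 else ABSTAIN


def lf_many_exclamations(text: str) -> int:
    """Vote POSITIVE (spam) if the message has two or more '!'."""
    return POSITIVE if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_short_message, lf_many_exclamations]


def majority_vote(text: str) -> int:
    """Combine noisy votes by majority; abstain on ties or when no LF fires.

    This is a stand-in for a learned label model, which would instead
    weight each labeling function by its estimated accuracy.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between labels
    return ranked[0][0]
```

Each function encodes one heuristic and may abstain, so any single vote is unreliable. Treating every vote equally, as this sketch does, is the simplest baseline; the paragraph above describes replacing that with a model that learns which functions to trust more.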

<img src="figs/dp_neurips_2016.png" width="500" align="middle" />
Expand Down Expand Up @@ -219,7 +238,7 @@ It's usually most convenient to write most code in an external `.py` file, and l
```
A more convenient option is to add these lines to your IPython config file, in `~/.ipython/profile_default/ipython_config.py`:
```
c.InteractiveShellApp.extensions = ['autoreload']
c.InteractiveShellApp.exec_lines = ['%autoreload 2']
```
