
Release Announcement #21

Closed
Yura52 opened this issue Jul 25, 2020 · 3 comments
Labels
discussion Opinions and ideas are especially welcome

Comments

@Yura52
Owner

Yura52 commented Jul 25, 2020

UPDATE (5th June 2022)

  • The library is now called "DeLU"
  • At some point, a similar discussion will be created for DeLU

UPDATE (10th June 2021)


Zero

I would like to introduce Zero, a new general-purpose library for PyTorch users. Zero:

  • simplifies training loops, model evaluation, model application and other typical Deep Learning tasks
  • provides a collection of tools and leaves code organization to you
  • can be used on its own or together with PyTorch frameworks such as Ignite, Lightning, Catalyst and others

You can discuss Zero and share feedback anywhere on the web:

Useful links:

Motivation (why yet another library?)

(if you want to see some code, skip to the next section)

As a Deep Learning researcher, I have experience with bare PyTorch and Ignite. I am also aware of other frameworks and the ideas behind them (Lightning is especially impressive with its wide range of out-of-the-box functionality). Each of them provides a unique user experience and makes many cool things available "at the press of a button" (early stopping, checkpointing, logging, etc.). They also have one thing in common: all of them provide some sort of abstraction over the training loop, which means that you no longer implement the loop by hand.

However, I personally feel more comfortable when implementing training loops manually. I could start talking about how important flexibility and simplicity are to me, but let's say that this is a matter of personal preference. The problem is that many useful components of existing frameworks are tied to abstractions over the training loop (usually in the form of callbacks) and cannot be used separately from those abstractions. In other words, if I want to use, for example, Early Stopping from framework X, then I have to rebuild the whole project around framework X. At the same time, I see no fundamental reason why Early Stopping (as well as many other tools) cannot be a fully separate entity.
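To illustrate the point, here is a minimal sketch of early stopping as a fully standalone tool (my own illustrative class, not Zero's ProgressTracker): it only needs to be fed a validation score once in a while and has no ties to any training-loop abstraction.

# A minimal, framework-agnostic early-stopping tracker (illustrative sketch only,
# not Zero's implementation). "Higher score is better" is assumed for simplicity.
class SimpleEarlyStopping:
    def __init__(self, patience: int):
        self.patience = patience
        self.best_score = None
        self.bad_updates = 0

    def update(self, score: float) -> None:
        # Reset the counter on improvement, otherwise count a "bad" update.
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.bad_updates = 0
        else:
            self.bad_updates += 1

    @property
    def should_stop(self) -> bool:
        return self.bad_updates > self.patience

# Usage in any hand-written loop, regardless of the framework around it:
# tracker = SimpleEarlyStopping(patience=10)
# ... after each validation: tracker.update(val_accuracy)
# ... and stop when tracker.should_stop is True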

Overall, Zero aims to be a "library" rather than a "framework":

  • no "central entities/abstractions"
  • the tools provided by Zero do not form an "ecosystem" and do not strongly depend on each other
  • it is possible to use those tools without rewriting your project (including scenarios where you use frameworks)
  • it is possible to replace those tools with custom alternatives at any moment, once they no longer fit your needs

Quick Demo

Let's have a look at a simplified version of this classification task example (MNIST) in order to get an idea of what I mean:

...
from zero.all import Eval, ProgressTracker, Stream, concat, learn, to_device, ...

def main():
    ...  # model, optimizer, data, device, etc.

    def step(batch):
        X, y = to_device(batch, device)
        return model(X), y

    def calculate_accuracy(loader):
        with Eval(model):
            logits, y = concat(map(step, loader))
        y_pred = torch.argmax(logits, dim=1).to(y)
        return (y_pred == y).int().sum().item() / len(y)

    stream = Stream(DataLoader(train_dataset, batch_size))
    progress = ProgressTracker(early_stopping_patience)

    while not progress.fail and stream.increment_epoch(n_epoches):
        for batch in stream.data(epoch_size):
            loss = learn(model, optimizer, F.cross_entropy, step, batch, True)[0]
            print(
                f'Epoch: {stream.epoch} '
                f'Iteration: {stream.iteration} Train loss: {loss}'
            )

        accuracy = calculate_accuracy(val_loader)
        progress.update(accuracy)
        if progress.success:
            torch.save(model.state_dict(), best_model_path)

    model.load_state_dict(torch.load(best_model_path))
    test_accuracy = calculate_accuracy(test_loader)
    ...

What we see from the example:

  • Stream:
    • manages only things related to the loop (epoch, iteration, data source) and knows nothing about Deep Learning
    • supports custom epoch size (stream.data(epoch_size))
    • additionally
      • enables other forms of loop:
        while True:
            x = stream.next()  # get the next batch/item
            ...
            if stream.iteration % frequency == 0:
                ...
        
        # or
        
        for x in stream.data(math.inf):
            ...
            if stream.iteration % frequency == 0:
                ...
      • allows changing the data source on the fly (stream.set_loader(new_loader))
      • (not implemented: issue) allows saving and restoring the loop's state
  • ProgressTracker detects moments:
    • when there is no progress for too many updates (a.k.a. "early stopping": while not progress.fail)
    • when your best score is updated (if progress.success)
  • Eval switches models to evaluation mode and turns off gradients
  • concat enables easy batchwise application of models and functions
  • ... etc ...
  • and you can replace any of the mentioned tools with custom implementations at any moment (for example, learn is just the "default" training step and does not even try to be a universal solution that fits everyone; see the sketch below for what such replacements might look like)
  • and many of the mentioned tools can be used in projects based on Lightning/Ignite/Catalyst/etc.

Overall, we see patterns we all know, but the code is more concise and the amount of non-informative noise is reduced. We also see that the tools from Zero are used independently of each other and are not tied to any central entity.
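To make the "replaceable at any moment" point concrete, here is a minimal sketch of custom stand-ins for Eval and concat (illustrative re-implementations under my own assumptions, not Zero's actual code); they could be dropped into the demo above without touching anything else.

import contextlib

import torch

@contextlib.contextmanager
def my_eval(*models):
    # Switch the given models to evaluation mode and disable gradients,
    # restoring the previous training mode on exit.
    was_training = [m.training for m in models]
    for m in models:
        m.eval()
    with torch.no_grad():
        try:
            yield
        finally:
            for m, t in zip(models, was_training):
                m.train(t)

def my_concat(outputs):
    # Concatenate an iterable of (tensor, tensor, ...) tuples along dim 0,
    # e.g. the per-batch (logits, y) pairs produced by `step`.
    parts = list(outputs)
    return tuple(torch.cat(tensors, dim=0) for tensors in zip(*parts))

# Drop-in usage in the demo above (replacing Eval and concat):
# with my_eval(model):
#     logits, y = my_concat(map(step, loader))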

Proof of concept

Now, an important note. The release (v0.0.2) is only a proof of concept of the ideas described above. It means that:

  • the main goal is to share the ideas in the form of actual code, not just on paper; as of now, the overall functionality is, admittedly, very limited
  • the project is tested with 90%+ coverage, but there is no continuous integration, coverage monitoring, etc.
  • the API, the package structure and everything else may change in future versions
  • exceptions are raised without any messages (however, all conditions for exceptions are documented)
  • etc.

In other words, the project is not that mature and polished. However, it should work just fine for research projects.

What are the next steps?

The main motivation for the release was my idea that "such a library should exist", so I developed a minimal version in order to see what it looks like when actually implemented. Although there is an issue tracker and the "0.1.0" milestone, I feel that the community is the one who will ultimately decide (in the form of feedback, contributions, likes, stars, etc.) how actively the project will be developed. Overall, the main question is the following: do people need such a library?

As of now, the functionality is limited, but it is self-sufficient, documented and tested. I am using Zero in my new research project and I really like how it feels 😎

P.S. Why "Zero"?


@Yura52 Yura52 added the discussion Opinions and ideas are especially welcome label Jul 25, 2020
@Yura52 Yura52 pinned this issue Jul 25, 2020
@t-vi

t-vi commented Jul 28, 2020

Hi!
I like the library idea and do like to spell out my training loop.

One thing:

while not progress.fail and stream.increment_epoch(n_epoches):
    for batch in stream.data(epoch_size):

I think it would be much more convenient to combine progress and stream epochs into an iterator to have something like

for _ in stream.epochs_with_progress(...):
    for batch in stream.data(epoch_size):

(with the yielded value being the epoch number or so), or even have an epoch object that provides the iteration over batches.
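A rough sketch of what I have in mind (epochs_with_progress is hypothetical here, built only on the pieces already shown above):

def epochs_with_progress(stream, progress, n_epochs):
    # Yield epoch numbers until either the epoch budget is exhausted
    # or the progress tracker signals early stopping.
    while not progress.fail and stream.increment_epoch(n_epochs):
        yield stream.epoch

# which would turn the loop head into:
# for epoch in epochs_with_progress(stream, progress, n_epoches):
#     for batch in stream.data(epoch_size):
#         ...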

@Yura52
Owner Author

Yura52 commented Jul 28, 2020

Hi! Thanks for the idea, created an issue. I agree that the "default" training loop could feel more ergonomic.

Repository owner locked and limited conversation to collaborators Jun 17, 2021
@Yura52 Yura52 closed this as completed Jun 4, 2022
@Yura52
Owner Author

Yura52 commented Jun 4, 2022

The library is now called "DeLU". At some point, a similar discussion will be created for DeLU.

@Yura52 Yura52 unpinned this issue Jun 4, 2022