Treon chokes on notebooks that refer to relative paths #12

Closed
@natanlao

Description

We're using treon for HumanCellAtlas/data-consumer-vignettes, and ran into a problem where notebooks that refer to relative paths can unexpectedly fail when tested with treon.

These notebooks expect the current working directory to be the directory that the notebook resides in. With treon, this is often not the case, since treon searches recursively for notebooks to test and the current working directory stays the directory from which treon was invoked.

I've been able to solve this locally by changing the working directory to the directory in which the notebook resides before testing the notebook. That said, this solution only works if treon is limited to a single thread (or only testing one notebook), since the current working directory is shared across all threads.
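A minimal sketch of that kind of directory switching is below. The helper names (`working_directory`, `run_in_notebook_dir`) and the `run` callback are hypothetical and are not part of treon. Because `os.chdir` changes the working directory for the whole process, this is only safe when a single thread is running notebooks.

```python
import contextlib
import os
from pathlib import Path


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the process-wide working directory to `path`."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)


def run_in_notebook_dir(notebook_path, run):
    """Call `run(notebook_path)` with the cwd set to the notebook's directory.

    `run` is a hypothetical stand-in for whatever actually executes the notebook.
    """
    notebook_path = Path(notebook_path).resolve()
    with working_directory(notebook_path.parent):
        return run(notebook_path)
```

The absolute path is resolved before changing directory so that the notebook can still be located after the switch.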

There are a few approaches to this that I can think of:

  1. Refactor treon to use multiprocessing instead of multithreading. From skimming the source code, this change looks more or less trivial. Multiprocessing would give each worker its own working directory and could also speed up CPU-bound notebooks by circumventing the GIL, at the cost of some process startup overhead.

  2. Add an option that performs the same directory switching I described above and limits the number of parallel threads to one.

  3. Ignore this problem - our workaround is to handle the directory-switching ourselves, running treon with one notebook at a time, while still achieving parallelism with xargs.
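The isolation that option 1 depends on comes from each process having its own working directory. The sketch below illustrates that property only. It uses `subprocess` with the `cwd` argument rather than the `multiprocessing` module, and the child command is a stand-in that reports its own directory instead of executing a notebook. The names `run_isolated` and `run_all` are hypothetical and are not treon functions.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Stand-in for real notebook execution: the child only prints its own cwd.
CHILD_CODE = "import os; print(os.getcwd())"


def run_isolated(notebook_path):
    """Run a child process whose working directory is the notebook's directory."""
    notebook_dir = Path(notebook_path).resolve().parent
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        cwd=notebook_dir,  # affects only the child, not the parent process
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_all(notebook_paths, max_workers=4):
    """Launch one child per notebook in parallel; the parent's cwd never changes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_isolated, notebook_paths))
```

Here the parent can keep using threads, because the directory change happens inside each child process rather than in the shared parent.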

I'm not sure if this is a widespread use case, and I'm happy to take a shot at either of the first two options myself.
