Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️ Re-design package and introduce large parts of the API #90

Merged
merged 25 commits into from Jun 21, 2022
Merged

Conversation

Koncopd
Copy link
Member

@Koncopd Koncopd commented Jun 16, 2022

This PR largely follows the design spec here: https://www.notion.so/laminlabs/50e2e172242b4c609f600f6a282831c2.

Summary:
Added developer API to nbproject.dev with:

  • infer_dependencies for dependencies inference from a notebook.
  • initialize_metadata for nbproject metadata initialization.
  • check_integrity for integrity check: Add an integrity check functionality #70
  • read_notebook, write_notebook (along with the notebook model Notebook) for jupyter notebook io.
  • Moved notebook_path there.

Added testing API nbproject.dev.test with execute_notebooks, which executes all notebooks in a given folder.

Added metadata API to nbproject.meta with

  • store a wrapper around nb.metadata["nbproject"].
  • live API-level access to metadata computed at run-time with .time_run (current date & time), .time_passed (compute time spent within the notebook), .integrity (current integrity status of the current notebook), .dependency (dependencies inferred from current notebook content & environment).

Notes:

  • nbproject.dev is structured to avoid loading heavy modules (_dependency.py) on init and avoid the need to add things to dev __init__.py as it is impossible to load without executing __init__.py. It is possible, for example, to do from nbproject.dev import test; test.execute_notebooks(), it is impossible to do from nbproject.dev.test import execute_notebooks.

@Koncopd Koncopd marked this pull request as draft June 16, 2022 15:28
@github-actions
Copy link

github-actions bot commented Jun 16, 2022

@github-actions github-actions bot temporarily deployed to pull request June 16, 2022 15:29 Inactive
@Koncopd
Copy link
Member Author

Koncopd commented Jun 16, 2022

Metadata API is still in process.

@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 15:00 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 15:51 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 15:55 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 16:13 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 16:18 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 17, 2022 16:48 Inactive
@Koncopd Koncopd marked this pull request as ready for review June 17, 2022 17:39
@Koncopd
Copy link
Member Author

Koncopd commented Jun 17, 2022

Still need to add docs.

@falexwolf falexwolf changed the title ♻️ Refactor and add API ♻️ Re-design and introduce large parts of the API Jun 18, 2022
@falexwolf
Copy link
Member

This is all great work, Sergei!

@github-actions github-actions bot temporarily deployed to pull request June 18, 2022 16:34 Inactive
def infer_dependencies(content: Union[Notebook, list], pin_versions: bool = True):
"""Parse the notebook content and infer all dependencies.

Params
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sergei, I'm sorry for not having clearly communicated this before. We're doing Google-style docstrings across all packages. With the advent of type annotations in the signatures, Google's more compact format works better. I'm very happy to convert all of the numpy-style docstrings to Google-style but I don't want to interfere with your PR. 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes only affect the param-lists, which are present in about 4 docstrings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i will change, but i am pretty sure i saw numpy style in other repos.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we started out with numpy-style, but changed about a month ago. Besides looking more compact while remaining readable, Google-style is also faster to write! Hence that decision.

@falexwolf falexwolf changed the title ♻️ Re-design and introduce large parts of the API ♻️ Re-design package and introduce large parts of the API Jun 19, 2022
@falexwolf falexwolf changed the title ♻️ Re-design package and introduce large parts of the API ♻️ Re-design package and introduce large parts of the API Jun 19, 2022
@github-actions github-actions bot temporarily deployed to pull request June 20, 2022 17:43 Inactive
@falexwolf
Copy link
Member

Sorry for the slow response! Yes, your last solution seems the right way to do it!

Document an un-initialized class .Meta and have .meta point to an instance that's specific to a given notebook.

I've used that strategy many times.

@@ -32,7 +31,7 @@
__version__ = "0.0.9"

from ._header import Header # noqa
from ._meta import Live, Meta, Metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is Metadata as opposed to Meta? 🤔

We didn't go this granular in the design specification and you also don't explain it in the PR description.

Copy link
Member Author

@Koncopd Koncopd Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, meta is an object of the class Meta. This object is initialized on import with the filename, time_run of the current notebook. And Metadata is just a pydantic model for nbproject.metadata. Is the name confusing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing that there is both Meta and Metadata.

I now understand that the latter is a mere shallow wrapper of the notebook metadata itself.

Hence, Meta is the nbproject metadata, whereas Metadata is the nbformat metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the typical user won't see Metadata, so we could give this a longer name to disambiguate it for the developers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about MetaJSON or MetaFormat or JSONMeta or NBFormatMeta. Or just everything lowercase for simplicity?

Copy link
Member Author

@Koncopd Koncopd Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the variants look ugly to me. NBProjectMetadata maybe? Or MetaStore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetaStore is great if that's what is returned in Meta.store!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great anyway!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the perfect name for this!

@github-actions github-actions bot temporarily deployed to pull request June 20, 2022 18:04 Inactive
@falexwolf
Copy link
Member

In the commit, I'm making a mere suggestion on listing all classes on one page instead of generating page stubs.

The page stubs are great for complex classes with many attributes and methods.

The classes here are not so complicated and the user can probably parse the design most rapidly if it's on a single page.

@github-actions github-actions bot temporarily deployed to pull request June 20, 2022 18:16 Inactive
from ._notebook import Notebook


class Metadata(BaseModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good name could be MetaBase or MetaRaw, as that would suggest that it's upstream of nbproject.Meta.

Copy link
Member

@falexwolf falexwolf Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you do the rename to MetaStore, you can also add a docstring.

"""The metadata stored in the notebook file ."""

Or something like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation will automatically show that the base class is pydantic.BaseModel.


nb_meta = read_notebook(filepath).metadata
if "nbproject" in nb_meta:
self._store = Metadata(**nb_meta["nbproject"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly! MetaStore is the perfect name here! 🤩

return self._live

@property
def store(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could then add a return type MetaStore here.

if filepath is None:
filepath = notebook_path()

self._live = Live(filepath, time_run)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest you rename Live to MetaLive and then everything will be named in a parallel fashion.

self._store = None

@property
def live(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then you can add a return type MetaLive and it should render beautifully in the docs!

@github-actions github-actions bot temporarily deployed to pull request June 20, 2022 19:28 Inactive
@github-actions github-actions bot temporarily deployed to pull request June 20, 2022 19:35 Inactive
@falexwolf
Copy link
Member

Awesome! This looks super great now! 😅

@falexwolf
Copy link
Member

Would you want to merge and @sunnyosun & I test it on main?

@github-actions github-actions bot temporarily deployed to pull request June 21, 2022 09:58 Inactive
@Koncopd
Copy link
Member Author

Koncopd commented Jun 21, 2022

@falexwolf i am merging this.

@Koncopd Koncopd merged commit 2f54f4b into main Jun 21, 2022
@Koncopd Koncopd deleted the api branch June 21, 2022 10:02
from ._header import _filepath, _time_run
from ._logger import logger


def get_title(nb: Notebook) -> Union[str, None]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆

Copy link
Member Author

@Koncopd Koncopd Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, it is displayed as Optional in the docs anyways, even if written as Union.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know! But that's going to go away with Python 3.10. We'll have beautiful str | None then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know!

@falexwolf
Copy link
Member

Thank you, @Koncopd! Are you down for celebrating this? Getting a beer somewhere in the next few days or so?

@Koncopd
Copy link
Member Author

Koncopd commented Jun 21, 2022

@falexwolf yep, why not, Friday?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants