-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Rename, reorganize schema module #1963
Conversation
for more information, see https://pre-commit.ci
…sbrar/ludwig into refactor_remove_marshmallow_extraction
Could you add your directory structure description as header
|
Added |
ludwig/schema/schema.py
Outdated
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# Module structure: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ludwig/schema/schema.py
is weird. can you move this function into ludwig/schema/__init__.py
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this actually causes a circular import error
I don't know enough about how python does submodule initialization but it seems as though to import schema.utils (which is used in e.g. the combiner and trainer files) it will also pull in that init file (which itself pulls in schemas from those trainer/combiner files)
something I talked about with @justinxzhao in the past was moving the trainer, combiner, optimizer schemas into this new ludwig.schema module as well. I was going to do that in a follow up PR, but should I do this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to move the other schemas over into this module, per conversation with @tgaddair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just two things to be fixed.
for more information, see https://pre-commit.ci
Redo of #1936 (which was reverted by 6d8474a) . This ends up being a significant reorganization:
Changes
ludwig.utils.schema
intoludwig.marshmallow
, and (2) Renamesludwig.marshmallow
module toludwig.schema
andludwig/marshmallow/marshmallow_schema_utils.py
toludwig/schema/utils.py
. Also, for convenience, all of the functionality inside of the currentschema.py
(namely,get_schema()
which returns the full Ludwig schema andvalidate_config()
) lives insideludwig/schema/__init__.py
to make imports elsewhere more readable.TrainerConfig
andget_trainer_jsonschema()
intoludwig/schema/trainer.py
.ludwig/modules/optimization_modules.py
and intoludwig/schema/optimizers.py
'sequence_encoder_registry
out ofludwig/combiners/combiners.py
, converts it into a full-fledgedRegistry()
, and places it inludwig/encoders/registry.py
.ludwig/combiners/combiners.py
and into a newludwig/schema/combiners/
directory (module) with a file per individual combiner. Again, to make things more readable elsewhere,__init__.py
is set so that a user can import directly from this submodule (e.g. a user can writefrom ludwig.schema.combiners import ConcatCombinerConfig
).Misc.
train_with_backend
integration test.Overview:
schema
module is the following:Do not merge until after #1961 goes through!Ready for review.