Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: addition of class model support #201

Merged
merged 1 commit into from
May 8, 2019

Conversation

jma
Copy link
Contributor

@jma jma commented Dec 22, 2018

@jma jma force-pushed the class_model branch 8 times, most recently from a6c42f3 to 640b9b8 Compare December 23, 2018 14:50
tests/test_invenio_records.py Show resolved Hide resolved
setup.py Show resolved Hide resolved
@ntarocco
Copy link
Contributor

ntarocco commented Jan 8, 2019

@jma nice! I didn't check in details the changes yet, but I see that RecordMetadata class is hardcoded (unfortunately) in other Invenio modules, especially for ForeignKeys: https://github.com/search?q=org%3Ainveniosoftware+RecordMetadata&type=Code

I guess this is a major breaking change and all modules need to updated at the same time. Did you think about this too? (ping @lnielsen @egabancho @slint )

@jma jma requested review from lnielsen and ntarocco January 13, 2019 07:15
@jma
Copy link
Contributor Author

jma commented Jan 14, 2019

@ntarocco it should not change anything in other packages, it will just allow to create a new class with an other table name if someone want to do it.

@egabancho
Copy link
Member

@jma I think the main problem comes when other modules use this model directly for something, i.e. invenio-records-files, in this cases if you try and use a different table for your records I am not sure the foreign keys will work (perhaps I am missing something here)
I came across this peace of documentation from SQLAlchemy which maybe it's worth taking a look, most probably you've read it already ☺️

@ntarocco
Copy link
Contributor

low to create a new class with an other table name if someone want to do it.

I agree with you, but before going ahead with this solution and merge and release, I would try to think how we resolve the references for ForeignKeys/queries of other modules, because we might have to go a little bit further with the implementation.

I am thinking that if I want to use this code to have another table for records, I am still stuck with files references for example.
Maybe, and just maybe (didn't give a real thought about it), we need to put the model class in a config, so other modules can then use the config to get the model class, even if it might be problematic...
@jma WDYT?

@egabancho already commented a similar thing, ping @lnielsen and @slint if they have other comments.

@jma
Copy link
Contributor Author

jma commented Jan 16, 2019

@ntarocco form my point of view, it is very important to find a solution to split records in different tables for several reasons:

  • the final table will be too big, lets imagine to put authorities, bibrecord, items and loans in the same table, you will ends with several dozen millions of records
  • some type of data need versioning such as bib records others not such as statistics, transactions
  • some of them needs the JSON column to be indexed to avoid ES index creation, such as libraries, or item types, etc.

I tried several approaches as describe in the documentation, but I don't find any solution by derivating the RecordMetadata class as it implies SQL inheritance. It is especially useful when you want to add a new column in the child and keeping the inherited columns in the parent table. It can be used to add foreign key for example. But, it seems impossible to have all the column of the child in a new table.

I understand you point about invenio-records-files and other invenio modules. However, if we introduce a new feature in a existing modules, we need to be sure that it is back compatible. If we have to wait that new feature is available or used by all the dependency module, it makes any new features introduction difficult.

@slint
Copy link
Member

slint commented Jan 16, 2019

@jma All your points are valid and have been discussed in a similar fashion IRL in the past. It is indeed important to address these issues, we have encountered them in Zenodo and even had operational issues that were made worse by the fact that records_metadata (and its versioning "shadow"-table) might grow to be enormous tables after some time (in row count and disk size).

We can merge this PR since indeed it keeps things nice and tidy in terms of backward and cross-module compatibility (i.e. nothing will break even if we merge and release a new version).

I guess our main concern is about eventual next steps (in this module and friends) and how to plan them out to avoid shooting ourselves in the foot and end up in situations similar to this one, where a somewhat "simple" decision to create a single JSON-blob RecordMetadata model and then rely on it for FK relationships might now require DB migrations with downtime and all sorts of other nasty stuff for existing running services (wrt invenio-records-files).

We're all willing/expecting to do this anyways in one way or another for the sake of new and better features, but it's easier to accept and embrace if we are all aligned with it and understand the complications and potential solutions.

tests/conftest.py Outdated Show resolved Hide resolved
@@ -137,6 +137,8 @@ def dumps(self, **kwargs):
class Record(RecordBase):
"""Define API for metadata creation and manipulation."""

class_model = RecordMetadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that critical, though naming-wise I'm a bit conflicted about class_model... One would say it's "the model of the class" (thus class_model sounds good), though I feel that something like model_class/model_cls would better express the "internal"/"programmatic" nature and use of the field (cc @inveniosoftware/architects).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for model_cls

It would be nice to include a docstring and add a section to docs when to use it.

@jma jma force-pushed the class_model branch 4 times, most recently from d269b49 to e442626 Compare January 25, 2019 13:54
@lnielsen
Copy link
Member

@ntarocco We need a v1.1 release before merging so this should go into a v1.2

@zzacharo
Copy link
Member

@jma https://travis-ci.org/inveniosoftware/invenio-records/jobs/514564803#L1060 is failing probably because of werkzeug update. Could you please check it? Otherwise the PR is ready to go!

Copy link
Member

@jirikuncar jirikuncar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice generalisation 👍


Q: Is there a way to add a table to ignore list in invenio-db?

* closes: inveniosoftware#191

Signed-off-by: Johnny Mariéthoz <johnny.mariethoz@rero.ch>
@ntarocco ntarocco merged commit 84248d5 into inveniosoftware:master May 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ModelClass as class variable for the Record class
7 participants