Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/49 dynamic schema generation #69

Merged
merged 11 commits into from
Nov 15, 2022

Conversation

zeyus
Copy link
Member

@zeyus zeyus commented Nov 14, 2022

Fixes #49

So, we now have a fluent, SQLAlchemy compatible interface to dynamic schemas per user per datasource. These are stored in unique SQLAlchemy SQLite database files.

@zeyus zeyus linked an issue Nov 14, 2022 that may be closed by this pull request
id = Column(Integer, primary_key=True)
user = Column(Integer, ForeignKey("user.id"))
data_source_id = Column(String(80), unique=True, nullable=False)
user_id = Column(Integer, nullable=False)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I def shouldn't need a user id here, that's already in the user relationship, fixing now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one further comment here, that column has been removed, but we might need a column or columns at some point to store information about the data source, like which columns are used in analysis or filtering etc, but that most likely belongs in the controller / model of the specific analysis, not in the data source / data source manager.

@zeyus
Copy link
Member Author

zeyus commented Nov 14, 2022

all suggestions, comments, improvements, criticisms welcome! :D

os.makedirs(ds_dir)
# generate a unique filename from a hash of the data source id
# this is to prevent a user from accessing another user's data
ds_id_hash = hashlib.sha256(self._data_source_name.encode()).hexdigest()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I should make the hash not based on the data source name, because the user might change it (if we let them)...

The ID might be better, but it's also predictable, this might not matter too much because the data directory should never be publicly accessible, so I think we should change it to ID rather than name, as ID should be unique.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any circumstance why the ID of the user may change? some migration? Docker reasons?

Copy link
Member Author

@zeyus zeyus Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not in any way that will affect things, migrations won't change user IDs, and independent installations can have different IDs but they will also have their own data sources and their own users so it won't be an issue

@zeyus zeyus merged commit 9044838 into develop Nov 15, 2022
@zeyus zeyus deleted the feature/49-dynamic-schema-generation branch November 15, 2022 10:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dynamic schema generation
2 participants