Add schema generation for extracted database #34
Comments
Would it really make sense to extract a partial data model without the unrelated tables? |
Excellent question; it depends on the context. For database application testing, probably not. For extracting data from a database used in a scientific lab for publication, it makes great sense. In the lab, we use the same database to link data coming from different experiments. The database is used by several smaller applications that each work on only a part of it. So, if I want to publish datasets describing one particular study, I would like to extract only the data relevant to that study. This can be done with Jailer by defining the model, with relevance judged by the researcher. As part of the extraction, I don't need the database schema covering every other ongoing experiment, as it is irrelevant to the current publication. It would be preferable to extract only the relevant schema as well. Please let me know if I missed something in your question. |
I see, that makes sense. Thanks for the insight. |
I agree, that could be difficult to do in a database-independent way. We are using PostgreSQL. Now that I think of it, maybe I would just need the list of exported tables as I can use |
This information is contained in the generated SQL script and could theoretically be extracted from it:
However, all tables from which no rows were exported (e.g. because the table in the source database happens to be empty) but which are still relevant would be missing. Perhaps a CLI command would be useful that returns a list of all tables potentially (transitively) related to the subject table? |
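As a rough illustration of pulling the table list out of a generated script, here is a minimal Python sketch. It assumes the script contains plain `INSERT INTO <table>` statements that start at the beginning of a line; the function name `exported_tables` and the sample statements are made up for this example and are not Jailer output. As noted above, tables with no exported rows will not show up.

```python
import re

# Matches the table name after INSERT INTO, optionally schema-qualified and quoted.
# Assumption: each statement starts at the beginning of a line.
INSERT_RE = re.compile(
    r'^\s*INSERT\s+INTO\s+((?:"[^"]+"|\w+)(?:\.(?:"[^"]+"|\w+))*)',
    re.IGNORECASE | re.MULTILINE,
)

def exported_tables(sql_text: str) -> list[str]:
    """Return the distinct INSERT target tables, in first-seen order."""
    seen: dict[str, None] = {}
    for match in INSERT_RE.finditer(sql_text):
        seen.setdefault(match.group(1), None)
    return list(seen)

# Hypothetical sample script, for illustration only.
sample = """
Insert into EMPLOYEE(ID, NAME) values (1, 'A');
Insert into EMPLOYEE(ID, NAME) values (2, 'B');
INSERT INTO public."Department" (ID) VALUES (10);
"""
print(exported_tables(sample))
```

A regular expression like this can be fooled by row data that itself contains a line starting with `INSERT INTO`, so treat the result as a hint rather than an authoritative list.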
This list would be useful for me as well. I generate reports about our tables and databases to help manage them. A list of the tables that are transferred between databases using Jailer would be very nice. I could not find an easy way to generate that list. |
It would be useful, indeed. Ideally, it should stop associations according to the extraction model. Thus, if we have tables
and user disconnected |
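To make the idea of a transitive closure that stops at disabled associations concrete, here is a small Python sketch. It is not how Jailer implements it; it assumes the data model can be reduced to directed `(from, to)` table pairs, and the table names `A` to `D`, the `disabled` set, and the function `table_closure` are all hypothetical.

```python
from collections import deque

def table_closure(subject, associations, disabled=frozenset()):
    """Tables reachable from `subject` without following disabled associations."""
    reachable = {subject}
    queue = deque([subject])
    while queue:
        table = queue.popleft()
        for source, target in associations:
            # Skip edges that start elsewhere or that the model has disabled.
            if source != table or (source, target) in disabled:
                continue
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    return sorted(reachable)

# Hypothetical data model: directed (from, to) associations.
associations = [("A", "B"), ("B", "C"), ("C", "D")]
print(table_closure("A", associations))                         # ['A', 'B', 'C', 'D']
print(table_closure("A", associations, disabled={("B", "C")}))  # ['A', 'B']
```

With the `B` to `C` association disabled, `C` and `D` drop out of the closure even though they are related in the full data model.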
In the next release there will be the CLI tool "print-closure":
If you want to test this in advance, you can unzip the file in the attachment and replace the file "jailer.jar" with it. |
I did a quick test with our largest model (118 tables) and it works. |
Excellent, worked for me as well - exactly as expected. Please feel free to close the issue and thank you very much for your help! |
Available in release 9.5.6. |
Hi @Wisser, I am trying to use the CLI tool but I get an error:
2022-03-23 11:44:47,788 [main] ERROR - './extractionmodel/LT_Canada.jm' does not exist
java.io.FileNotFoundException: './extractionmodel/LT_Canada.jm' does not exist
at net.sf.jailer.extractionmodel.ExtractionModel.loadDatamodelFolder(ExtractionModel.java:522)
at net.sf.jailer.Jailer.updateDataModelFolder(Jailer.java:383)
at net.sf.jailer.Jailer.jailerMain(Jailer.java:274)
at net.sf.jailer.Jailer.main(Jailer.java:149)
Error: java.io.FileNotFoundException: './extractionmodel/LT_Canada.jm' does not exist
Arguments: 0: {print-closure}, 1: {./extractionmodel/LT_Canada.jm}
2022-03-23 11:44:47,797 [main] ERROR - working directory is /opt/jailer-database-tools/lib/app
The model file definitely exists and is in I have installed Jailer from the Arch Linux User Repository. It is installed in /opt. The GUI works fine, but I'm having trouble with the command line tools. It looks like it is using the wrong working directory. Any idea? Thanks! |
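Since the log shows the relative path being looked up under a different working directory, one thing that might be worth trying is passing an absolute path instead. The Python sketch below only shows how such a path could be built; the helper `absolute_model_path` and the base directory `/home/user/project` are hypothetical, and whether this avoids the error with this installation is an assumption, not something confirmed in this thread.

```python
import os.path

def absolute_model_path(model: str, base_dir: str) -> str:
    """Return `model` as a normalized absolute path, resolved against `base_dir`."""
    if os.path.isabs(model):
        return os.path.normpath(model)
    return os.path.normpath(os.path.join(base_dir, model))

# Hypothetical project directory, for illustration only.
print(absolute_model_path("./extractionmodel/LT_Canada.jm", "/home/user/project"))
```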
Hi @rbeucher The script |
Is your feature request related to a problem? Please describe.
I am evaluating Jailer as a tool to extract part of a database for publishing it. In research, when we collect data in lab databases as part of the experimental and analysis routine, it would be helpful if I could extract part of the data together with the schema of the related tables (ideally views and functions as well). Jailer looks to be a perfect fit. I can define which table data to extract and the associated extraction model, and generate SQL statements. Unfortunately, the generated SQL does not include schema creation.
Describe the solution you'd like
When starting Export tool:
Describe alternatives you've considered
An alternative would be to generate a schema dump with database utilities and manually remove all unrelated tables. That sounds like an error-prone solution, though.
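For PostgreSQL (the database mentioned in the comments), the manual pruning step could possibly be avoided by restricting the dump to a known table list. The Python sketch below only builds a `pg_dump` command line using its `--schema-only`, `--table`, and `--file` options; the database name `labdb`, the table names, and the helper `schema_dump_command` are placeholders, and the table list itself would have to come from somewhere else.

```python
import shlex

def schema_dump_command(dbname, tables, outfile):
    """Build a pg_dump argument list that dumps only the DDL of `tables`."""
    cmd = ["pg_dump", "--schema-only", f"--file={outfile}"]
    cmd += [f"--table={t}" for t in tables]  # one --table option per table
    cmd.append(dbname)
    return cmd

# Placeholder database and table names, for illustration only.
cmd = schema_dump_command("labdb", ["study", "sample"], "schema.sql")
print(shlex.join(cmd))
```

Note that when `--table` is used, `pg_dump` does not try to include other objects the selected tables depend on, so views and functions would still need separate handling.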
Additional context
In principle, Jailer could be used as part of the publishing model for scientific databases. The generated SQL statements could be published as they are, or used to generate an extracted database that is then published.