
[FEATURE] post_table by columns #80

Open
juliomateoslangerak opened this issue Jun 27, 2023 · 1 comment
@juliomateoslangerak (Contributor)

I've been trying post_table, and it feels a bit unnatural to me to pass a list of rows, probably because I was previously using the API directly. I had worked out some helper functions to add tables as lists of columns, much like what you would do with the API.

I had created a "post_table" that would look something like this:

def post_table(conn, table_name, column_names, column_descriptions, values,
               types=None, namespace=None, table_description=None)

conn: you know it already
table_name: a name for the table file, to which some random text is added as a suffix
column_names: a list of strings with the names of the columns
column_descriptions: a list of strings with the descriptions for the columns (should be optional, I presume)
values: a list of lists with the values; there is some internal logic to check the coherence of the data
types: a list of types. To be honest, I never use it, because the function checks the type of the first element in each values list (not really neat, but not an issue in my case) and creates the appropriate column type.
namespace: you also know it
table_description: too obvious to write too much about it. Oops, I already did.
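The type inference described for the `types` parameter could be sketched roughly as below. This is a hypothetical helper, not the actual implementation; the column type names are illustrative stand-ins for whatever OMERO table column classes the real function would create.

```python
# Hypothetical sketch: when `types` is None, guess each column's type
# from the first element of its values list, as described above.
def infer_column_types(values):
    """Return an illustrative column-type name for each list in `values`."""
    type_map = {int: "LongColumn", float: "DoubleColumn",
                str: "StringColumn", bool: "BoolColumn"}
    inferred = []
    for column in values:
        first = column[0]
        # bool is a subclass of int in Python, so check it explicitly first
        if isinstance(first, bool):
            inferred.append("BoolColumn")
        else:
            inferred.append(type_map.get(type(first), "StringColumn"))
    return inferred
```

As noted above, inferring from only the first element is not bulletproof (a mixed column would be mistyped), which is why a `types` override still makes sense.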

Following ezomero's logic, we should also pass the ID of the object to which the table should be linked.

I use this function to create tables from pandas very easily, like:

table = post_table(conn, "my_table",
                   column_names=my_df.columns.tolist(),
                   column_descriptions=my_df.columns.tolist(),  # now you see why this should be optional. Too lazy to write a description for every column
                   values=[my_df[c].values.tolist() for c in my_df.columns],
                   types=None,
                   namespace="my_super_analysis_workflow",
                   table_description="so lazy to write something"
                   )

On top of this, the function checks whether a column name is image_id, imageId, image id, etc. and of type int; if so, it creates an image column. Very useful for parade, etc. The same logic applies for dataset IDs and ROI IDs. This behaviour could be disabled with an optional bool like detect_object_cols=True.
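The object-column detection could be sketched along these lines. This is an assumed implementation for illustration only; the column-type names and the exact set of recognized names are placeholders.

```python
import re

# Hypothetical mapping from normalized column names to illustrative
# OMERO object-column type names (stand-ins, not confirmed API names).
OBJECT_COLUMNS = {"imageid": "ImageColumn",
                  "datasetid": "DatasetColumn",
                  "roiid": "RoiColumn"}

def detect_object_column(name, first_value):
    """Return an object column type if `name` looks like e.g. image_id,
    imageId, or 'image id' and the values are ints; otherwise None."""
    # Strip spaces and underscores, lowercase: "Image Id" -> "imageid"
    normalized = re.sub(r"[\s_]", "", name).lower()
    if normalized in OBJECT_COLUMNS and isinstance(first_value, int):
        return OBJECT_COLUMNS[normalized]
    return None
```

The `detect_object_cols=True` flag proposed above would simply gate whether this check runs at all.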

What do you think about this?

@erickmartins (Collaborator)

We have no plans to implement per-column tables right now - PRs are always welcome, of course!
