techref: Pub types #1

Merged
gabestein merged 2 commits into main from techref-06-21-23
Jun 22, 2023

@gabestein
Member

No description provided.

@gabestein gabestein requested a review from isTravis June 21, 2023 15:22
@isTravis isTravis requested review from 3mcd and isTravis and removed request for 3mcd and isTravis June 21, 2023 18:52
Member

@isTravis isTravis left a comment

Looks good on the notes. Could we add other devs as reviewers so they can comment in case there were any other salient takeaways in their mind?

Probably don't need approval from everyone to merge, but might be useful to generate the notification for them.

@gabestein
Member Author

gabestein commented Jun 21, 2023

Oh, forgot to mention: we can only add two reviewers. There's some setting on the repo that we need to change, but I couldn't find it. Perhaps you need to change it as the creator?

@3mcd
Collaborator

3mcd commented Jun 21, 2023

Looks great! But the proposed schema changes differ a bit from where I thought we had landed on pub types.

It looks like a community should define all possible pub types upfront. I wonder if this will cause a configuration issue for communities with several custom metadata fields, perhaps added to a pub across multiple steps in a workflow.

Preprint, ReviewedPreprint, ReviewedPreprintWithAIGeneratedHeader, etc.

This is a stretch, but if a community used integrations to generate an AI-generated header and tags/keywords for pubs concurrently, they might have to define each possible permutation of a pub, e.g.

  • ReviewedPreprintWithAIGeneratedHeader
  • ReviewedPreprintWithKeywords
  • ReviewedPreprintWithAIGeneratedHeaderAndKeywords

Ingestion also complicates things. A pub might have many initial states when generated from user input or by scraping meta tags from a page. The first step in the workflow might only care about a title, abstract, and authors list, while other fields might not be used until a later step where an integration makes a pivotal decision based on the presence of those fields.

Just some thoughts. I'm hesitant to say pubs should be completely dynamic and integrations should be polymorphic just for the sake of it, but it might be useful if we plan on supporting uses like the ones I outlined above.
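To make the permutation concern above concrete, here is a rough sketch that lists every type name a community would need if each combination of optional enrichments had to be declared as its own pub type. The function name and the enrichment labels are hypothetical and only mirror the example names in this comment.

```typescript
// Hypothetical sketch: enumerate the pub type names needed when every
// combination of optional enrichments must be declared as its own type.
function enumerateVariantNames(base: string, enrichments: string[]): string[] {
  const names: string[] = [];
  // Each bit of `mask` records whether the enrichment at that index is present.
  for (let mask = 0; mask < (1 << enrichments.length); mask++) {
    const present = enrichments.filter((_, i) => (mask & (1 << i)) !== 0);
    names.push(present.length === 0 ? base : `${base}With${present.join("And")}`);
  }
  return names;
}

const variants = enumerateVariantNames("ReviewedPreprint", [
  "AIGeneratedHeader",
  "Keywords",
]);
console.log(variants.length); // 4 names for 2 optional enrichments
```

The count doubles with each additional independent enrichment, which is the configuration burden being described.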

@isTravis
Member

I might be missing the reason why the simple approach won't work - but I had been imagining a Pub's type to be singular and consistent throughout a pub's lifetime, even as it acquires metadata through different stages of workflows. Using the context of your example, I imagine just a single Preprint type existing, e.g.

type Preprint = {
   title: string | null;   // null until a title is supplied
   keywords: string[];
   reviews: object[];
   header: string | null;  // null until a header is generated
}

And the initial "preprint" will just be

{
  title: null,
  keywords: [],
  reviews: [],
  header: null
}

Integrations and such will have to both check that the type can have the fields it needs, and then that they're not null to continue.

Initial stages may only care that the type supports title, as you suggest, while later workflow stages (and their integrations) require more fields to exist on the type and the specific instance of a given pub.
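The two-step check described above (does the type declare the field, and is it populated on this pub) could look roughly like the following. This is only a sketch: `PubTypeDef`, `PubInstance`, and `checkReadiness` are hypothetical names, not part of any schema discussed in this PR.

```typescript
// Hypothetical shapes for illustration only.
type FieldValue = string | string[] | object[] | null;

interface PubTypeDef {
  name: string;
  fields: string[]; // field names the type declares
}

interface PubInstance {
  pubType: PubTypeDef;
  values: Record<string, FieldValue>;
}

type Readiness =
  | { ok: true }
  | { ok: false; reason: "unsupported" | "missing"; fields: string[] };

function checkReadiness(pub: PubInstance, required: string[]): Readiness {
  // Step 1: the pub's type must declare every field the integration needs.
  const unsupported = required.filter((f) => pub.pubType.fields.indexOf(f) === -1);
  if (unsupported.length > 0) {
    return { ok: false, reason: "unsupported", fields: unsupported };
  }
  // Step 2: this particular pub must have a non-null value for each field.
  const missing = required.filter((f) => pub.values[f] == null);
  if (missing.length > 0) {
    return { ok: false, reason: "missing", fields: missing };
  }
  return { ok: true };
}

const preprintType: PubTypeDef = {
  name: "Preprint",
  fields: ["title", "keywords", "reviews", "header"],
};
const freshPreprint: PubInstance = {
  pubType: preprintType,
  values: { title: null, keywords: [], reviews: [], header: null },
};

console.log(checkReadiness(freshPreprint, ["title"]));
console.log(checkReadiness(freshPreprint, ["doi"]));
```

Note that in this sketch an empty array such as `keywords: []` counts as present, since only null values are treated as missing; whether empty lists should block an integration is a separate decision.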

@3mcd
Collaborator

3mcd commented Jun 21, 2023

I guess I assumed that the ultimate type of a pub could be unknown at the time it's created. e.g. a biorxiv admin decides a submission is COVID-related and adds some metadata fields to its pub based on that.

@kalilsn
Contributor

kalilsn commented Jun 22, 2023

The notes look good, and thanks for the schema example!

Reading this schema, I would have expected Prisma to generate a single query with a number of joins to get the metadata for these pubs. Interestingly, it instead issues a series of separate queries using WHERE ... IN:

prisma:query SELECT "public"."communities"."id", "public"."communities"."name", "public"."communities"."created_at", "public"."communities"."updated_at" FROM "public"."communities" WHERE 1=1 LIMIT $1 OFFSET $2
prisma:query SELECT "public"."pubs"."id", "public"."pubs"."pub_type_id" FROM "public"."pubs" WHERE ("public"."pubs"."community_id" = $1 AND "public"."pubs"."parent_id" IS NULL) OFFSET $2
prisma:query SELECT "public"."pub_types"."id", "public"."pub_types"."name", "public"."pub_types"."fields" FROM "public"."pub_types" WHERE "public"."pub_types"."id" IN ($1) OFFSET $2
prisma:query SELECT "public"."metadata"."id", "public"."metadata"."type", "public"."metadata"."value", "public"."metadata"."pub_id" FROM "public"."metadata" WHERE "public"."metadata"."pub_id" IN ($1) OFFSET $2
prisma:query SELECT "public"."pubs"."id", "public"."pubs"."pub_type_id", "public"."pubs"."parent_id" FROM "public"."pubs" WHERE "public"."pubs"."parent_id" IN ($1) OFFSET $2
prisma:query SELECT "public"."pub_types"."id", "public"."pub_types"."name", "public"."pub_types"."fields" FROM "public"."pub_types" WHERE "public"."pub_types"."id" IN ($1) OFFSET $2
prisma:query SELECT "public"."metadata"."id", "public"."metadata"."type", "public"."metadata"."value", "public"."metadata"."pub_id" FROM "public"."metadata" WHERE "public"."metadata"."pub_id" IN ($1,$2,$3) OFFSET $4

I haven't looked into this further, but intuitively this seems problematic, and this discussion seems to agree. Perhaps this is a reason to consider storing the metadata for a pub as a JSON column within the pubs table?
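For context, the log above has the shape of an application-side join: one query for the parent rows, then one `WHERE ... IN (...)` query per relation, with the results stitched together in memory. A rough sketch of that stitching step follows. The row shapes are hypothetical and only loosely based on the columns in the log, and this is not Prisma's actual implementation.

```typescript
// Hypothetical row shapes, loosely based on the columns in the query log.
interface PubRow {
  id: string;
  pub_type_id: string;
}

interface MetadataRow {
  id: string;
  type: string;
  value: unknown;
  pub_id: string;
}

interface PubWithMetadata extends PubRow {
  metadata: MetadataRow[];
}

// Combine the results of two separate queries (pubs, then metadata fetched
// with WHERE pub_id IN (...)) into nested objects, grouping by pub_id.
function attachMetadata(pubs: PubRow[], metadata: MetadataRow[]): PubWithMetadata[] {
  const byPubId: Record<string, MetadataRow[]> = {};
  for (const row of metadata) {
    const bucket = byPubId[row.pub_id] || [];
    bucket.push(row);
    byPubId[row.pub_id] = bucket;
  }
  return pubs.map((pub) => ({
    id: pub.id,
    pub_type_id: pub.pub_type_id,
    metadata: byPubId[pub.id] || [],
  }));
}

const stitched = attachMetadata(
  [
    { id: "p1", pub_type_id: "t1" },
    { id: "p2", pub_type_id: "t1" },
  ],
  [
    { id: "m1", type: "title", value: "A title", pub_id: "p1" },
    { id: "m2", type: "abstract", value: "An abstract", pub_id: "p1" },
  ]
);
console.log(stitched[0].metadata.length, stitched[1].metadata.length);
```

With the metadata stored as a JSON column on the pubs table, the second query and this grouping step would not be needed for metadata, since each pub row would already carry its own values.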

Also @gabestein, I think GitHub doesn't let you request multiple reviewers on private repos unless the owner (the pubpub org in this case, not the repo creator) has a paid plan. Assuming it's a free org, I think we've just been enjoying multiple reviewers on v6 because it's a public repo.

@gabestein gabestein merged commit 39c5b93 into main Jun 22, 2023
@gabestein gabestein deleted the techref-06-21-23 branch June 22, 2023 13:53
@gabestein
Member Author

gabestein commented Jun 22, 2023

@3mcd: I guess I assumed that the ultimate type of a pub could be unknown at the time it's created. e.g. a biorxiv admin decides a submission is COVID-related and adds some metadata fields to its pub based on that.

Fwiw, from the user perspective this particular example isn't a type change, it's a piece of Pub metadata. Per feedback on our second moment from the retreat, users likely will want to define an object's type in advance.

@isTravis: Integrations and such will have to both check that the type can have the fields it needs, and then that they're not null to continue.

Slightly after you left yesterday, we discussed just this, but didn't come to a decision. The way I think we expressed it was that integrations and UX elements will want to know which (presumably more generic) PubTypes a given Pub could satisfy, given its metadata, even if the PubType it was given is highly bespoke.
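One way to read "which PubTypes a given Pub could satisfy" is as a structural check against the pub's populated metadata, regardless of the type it was assigned. A minimal sketch under that assumption follows; `CandidateType`, `satisfiableTypes`, and the example type names are all hypothetical.

```typescript
// Hypothetical: a candidate type is described only by the fields it requires.
interface CandidateType {
  name: string;
  requiredFields: string[];
}

// Return the names of all candidate types whose required fields are all
// present (neither undefined nor null) in the pub's metadata.
function satisfiableTypes(
  metadata: Record<string, unknown>,
  candidates: CandidateType[]
): string[] {
  return candidates
    .filter((t) =>
      t.requiredFields.every((f) => metadata[f] !== undefined && metadata[f] !== null)
    )
    .map((t) => t.name);
}

// A pub with a bespoke assigned type, described here only by its metadata.
const bespokePubMetadata: Record<string, unknown> = {
  title: "An example title",
  abstract: "An example abstract",
  reviews: [{ reviewer: "example" }],
  header: null,
};

const genericTypes: CandidateType[] = [
  { name: "Document", requiredFields: ["title"] },
  { name: "Preprint", requiredFields: ["title", "abstract"] },
  { name: "ReviewedPreprint", requiredFields: ["title", "abstract", "reviews"] },
  { name: "Dataset", requiredFields: ["title", "files"] },
];

console.log(satisfiableTypes(bespokePubMetadata, genericTypes));
```

Under this reading, an integration would ask for a generic type such as the hypothetical `Preprint` and accept any pub whose metadata satisfies it, without needing to know the bespoke type the community defined.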

4 participants