Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move zod to regular deps #387

Closed
wants to merge 1 commit into from
Closed

Move zod to regular deps #387

wants to merge 1 commit into from

Conversation

Jaspooky
Copy link

@Jaspooky Jaspooky commented Aug 20, 2022

Resolves #386

@mmkal
Copy link
Contributor

mmkal commented Aug 20, 2022

It's only used in one place: https://github.com/gajus/slonik/blob/7f578fff637ae1784dd48a6600262d0da710e341/src/sqlFragmentFactories/createIntervalSqlFragment.ts

The type there is incredibly simple, just checks that a list of properties are numbers. Maybe it could be removed as a dependency rather than added.

@gajus gajus closed this in 3bd59e0 Aug 20, 2022
@gajus
Copy link
Owner

gajus commented Aug 20, 2022

@Jaspooky fixed

@github-actions
Copy link
Contributor

🎉 This issue has been resolved in version 30.3.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

@gajus
Copy link
Owner

gajus commented Aug 20, 2022

@mmkal For context, future versions of Slonik will move towards gradually adopting and eventually enforcing use of Zod.

We've migrated all queries to use sql.type and it has been one of the best decisions we could have made to decrease unexpected bugs. It is a pattern I will advocate for moving forward.

I think that in the context of SQL queries, TS types on their own are an anti-pattern – they provide a false sense of safety and can make it hard to spot bugs.

@mmkal
Copy link
Contributor

mmkal commented Aug 20, 2022

Interesting. I know you were using slonik-typegen at some point, did you find that it was producing incorrect types? While I agree that manually writing types is dangerous and gives a false sense of security, I don't see zod as a solution. Postgres already has a type system, and slonik-typegen uses it so types are always accurate. In fact, I would think that manually writing zod types for every possible query is at best a bad use of effort, and at worst actually will cause bugs, even if it hasn't yet. Here's a simple example of valid code that will throw at runtime, and manually typing zod schemas won't have any way of protecting against:

await pool.query(sql`
  create table persona(id int, name text);
  insert into person
    (id, name)
  values
    (1, 'Alice'),
    (2, 'Bob');
`)

// insert some data...

const Person = z.object({
  id: z.string(),
  name: z.string()
})

const alice = await pool.one(sql.type(Person)`
  select * from person where name = 'Alice'
`)

This will throw because id is a number, not a string. Developers are human and will make this mistake, very often. So, you'd need a tool to generate types anyway... which already exists without zod.

@gajus
Copy link
Owner

gajus commented Aug 22, 2022

@mmkal build-time type generation only provides guards at build-time. This is why it will never be a complete solution.

If you are developing just one application and have just one schema, then this is maybe not a huge problem, esp. if you can re-run type generation every time you update database schema, before deploying the changes.

However, in most organizations, databases are going to be accessed by many programs that are deployed separately, database managed by multiple teams, etc. Meaning that it is possible (and likely) that during the lifetime of an application the underlying database schema may change in such a way that the earlier generated types no longer represent the data that queries will return.

In the scenario that types are no longer accurate but runtime allows the data to flow, there is a high chance that you are either going to corrupt your data or, at the very least, start experiencing odd and hard-to-debug bugs.

We had this exact thing happen very recently. Here is a (simplified) example:

const alice = await pool.one(sql<{id: number, feature_flags: Record<string, boolean>}>`
  select id, feature_flags
  from person
  where name = 'Alice'
`);

alice.feature_flags['foo'] = false;

await pool.query(sql`
  update person
  set feature_flags = ${sql.json(alice.feature_flags)}
  where id = ${person.id}
`);

If the first query returns results matching the sql type, this code will function as expected. However, in our case, a change has been made to the database driver that caused feature_flags to be returned as a string. Nothing broke. We didn't even get errors because '{"foo":"bar"}'['foo'] is a valid code. However, our users couldn't access important features anymore, and no one knew why.

However, if there was a runtime check, this code would have thrown error immediately when the first invalid row is returned. Our engineers would immediately know that database is returning not what is expected, and we would have had a resolution shortly.

Following this incident, we started to rollout runtime checks for all queries. This surfaced a ton of cases where types did not match what queries return (and type generator would not have helped). The most common mistake was making false assumptions that a field cannot be NULL. In one case, this resulted in thousands of user profiles being marked noindex because the condition in code was written as index === true ? '' : '<meta name="robots" content="noindex">' and index was NULL for freshly created accounts.

Long story short, all of this can be easily avoided by adding runtime checks, with no real downsides. Anyone who thinks that code should continue to function even if queries return data different than what engineer specified, is going to be better off using a different client. Cannot make everyone happy.

I think the most likely outcome is going to be as follows:

  • sql will get renamed to sql.fragment – to separate query fragments from top-level queries.
  • sql alone won't be available
  • sql.type will be the expected input of all query methods
  • sql.unsafe is going to be added for development purposes

Before doing any of the above, the next step is figuring out how to make it easy to write/generate Zod schema for queries, and add helpers for migrating code, as this is a no-small undertaking.

@mmkal
Copy link
Contributor

mmkal commented Aug 22, 2022

Thank you for the detailed and informative response! That's great that zod has helped your team, but I figured it's worth sharing why I don't think it would help on teams like mine. A couple of replies inline below.

If you are developing just one application and have just one schema, then this is maybe not a huge problem, esp. if you can re-run type generation every time you update database schema, before deploying the changes.

However, in most organizations, databases are going to be accessed by many programs that are deployed separately, database managed by multiple teams, etc. Meaning that it is possible (and likely) that during the lifetime of an application the underlying database schema may change in such a way that the earlier generated types no longer represent the data that queries will return.

I don't know that this is true. It's definitely one way of doing things, and there are certain advantages. On my team we develop in a monorepo, and dependencies between teams are tracked by projects within the monorepo. With the right tooling, monorepos can scale to be arbitrarily big and complex (e.g. Google). It's not the only way of doing things, but it is a valid way of doing things. And in this case, it solves the problem you mention by making it impossible for changes like a db driver update to be deployed "under" another team without an explicit CI run - which would regenerate the types and catch the error. (Note that in your feature_flags example, you'd still have had an outage, it sounds like, it would have just been easier to debug). There are obviously tradeoffs and I'm not trying to persuade anyone to use monorepos, but in my opinion a library like slonik shouldn't have an opinion one way or another either.

Following this incident, we started to rollout runtime checks for all queries. This surfaced a ton of cases where types did not match what queries return (and type generator would not have helped). The most common mistake was making false assumptions that a field cannot be NULL. In one case, this resulted in thousands of user profiles being marked noindex because the condition in code was written as index === true ? '' : '' and index was NULL for freshly created accounts.

The type generator should handle nulls properly too, and errs on the side of caution, so it makes things nullable if it's not smart enough to be sure. There are various lint rules that can protect you against nullableThing === true type checks. On my team we prefer truthy/falsy checks rather than === ones for nullable values, so again I don't think this would have happened. I take the point, though, that there's value in using zod to assert that something is non-null, for when typegen isn't sophisticated enough to figure it out. I'd be interested to hear what some of the other cases are - these might just be a bug in typegen? In which case I'd argue the first step should be fixing in that repo rather than making changes here.

Cannot make everyone happy

Very true, and very reasonable. However, I do think it's possible to be empathetic to your users here. You've determined it makes sense for your team to require zod types for every query. However I don't think it's necessarily true for every team. Why not create a lint rule that says don't use sql` ... `?

Before doing any of the above, the next step is figuring out how to make it easy to write/generate Zod schema for queries, and add helpers for migrating code, as this is a no-small undertaking.

I'd be happy to help figure out how to make typegen have a zod-generation mode (in fact, we've actually already done this using ts-to-zod for a small subset of queries and it was pretty easy. This is also why I was pretty confident zod didn't need to be added as a hard dependency). I'd just rather not be forced to inherit your team's specific decisions. I know it's your library, so you have the right to do what you want. But I really like slonik! And I want to be able to keep using it roughly how we have been already. It's working great.

@gajus
Copy link
Owner

gajus commented Aug 22, 2022

Fair points. Thank you taking time to follow up. I will consult you before making related changes. I plan to write a few articles on the subject, first to see how the community reacts to the proposed changes. Feel free to email me if anything else in the mean time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zod has been added as a dev dependency
3 participants