Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Split up Types.hs and logically organize modules #1793

Merged
merged 39 commits into from
Apr 11, 2021

Conversation

monacoremo
Copy link
Member

@monacoremo monacoremo commented Apr 6, 2021

Edit: This PR is a first step in refactoring the codebase, see #1793 (comment)

Work in progress, pulling at the Types.hs string to see what comes out of it.

Modules graph before

Created with postgrest-hsie-graph-modules:

modules

After

Seems like it needs to get worse before it get's better:

modules

Symbols after

Created with postgrest-hsie-graph-symbols:

symbols

@wolfgangwalther wolfgangwalther self-requested a review April 8, 2021 12:10
@wolfgangwalther
Copy link
Member

wolfgangwalther commented Apr 8, 2021

Hm. I'm thinking about the best way to review this, because it's a lot of changes. I feel like going through this commit-by-commit might be the best idea. Of course this means, there should be no rewriting of the history done in here - otherwise it's confusing as hell. But you don't really do that anyway.. just don't start right now, please :D. This way, I can already start reviewing, while this is still WIP.

To review:

  • move ConnectionStatus to Main (approved)
  • split ContentTypes from Types (approved)
  • split DbStructureTypes from Types (approved)
  • split Preferences from Types (approved)
  • move LogLevel to Config from Types (approved)
  • move SCstatus from Types to Main (approved)
  • moved SqlFragment and Proxy from Types (approved)
  • moved PgVersion (approved)
  • move SourceColumn (approved)
  • split out JSPath (approved)
  • split out Headers (approved)
  • refactor remaining types (approved)
  • move remaining Types to Queries (approved)
  • style (approved)
  • tweak imports in Private (approved)
  • ApiRequests imports (approved)
  • App imports (approved)
  • DbStructureTypes exports (approved)
  • Error imports (approved)
  • JSPath exports (approved)
  • Middleware imports (approved)
  • Parsers imports (approved)
  • QueryTypes exports (approved)
  • QueryBuilder imports (approved)
  • Statements imports (approved)
  • move setConfigLocal (approved)
  • ContentTypes -> ContentType (approved)
  • DbStructure (approved)
  • move ProxyUri (approved)
  • move Preferences and SqlFragment, dissolve Private.Common (approved)
  • move JSPath and Proxy to Config (approved)
  • split CLI from Config (approved)
  • fix tests (approved)
  • move PgVersion to DbStructure (approved)
  • fix main (approved)
  • sketch part of potential target structure (approved)
  • Headers -> GucHeader (approved)
  • move setConfigLocal (approved)

@monacoremo
Copy link
Member Author

monacoremo commented Apr 8, 2021

Thanks for picking this up @wolfgangwalther ! Will not touch the history :-) There are a few blurs between the commits, and some decisions that I later changed (like splitting out PgVersions into a separate module). But I hope it's still relatively straightforward to follow!

Some notes:

  • We were quite inconsistent with the use of the Types module. Major types like ApiRequest, AppConfig etc. live in their own modules, while Types is a large mixed bag that contains all kinds of concerns. In this PR I try to put type declarations where they fit naturally. Another way would be to split all Types into Types.XXX modules.
  • There is a comment in Private.QueryFragment that all functions that generate SqlFragments should be in that module. This would make sense if we wanted to control what SqlFragments exists and how they are handled. This does not work current, as SqlFragment is just a type alias for ByteString, not an opaque type - so new SqlFragments could be sneaked in anywhere. And we string together ByteString snippets all over the place to build queries. So need to see whether it makes sense to make it an opaque type; probably we should rename Private.QueryFragment to SqlFragment
  • Private.Common should be pretty easy to dissolve
  • Config mixes two concerns, CLI and Config parsing. Suggest that we split the CLI part into a separate module
  • There is a lot happening in Main, with the complexity increased further by the conditional CPP compilations. Should probably move more into modules that can be tested.

Edit:
Also, still many wildcard imports and inconsistent aliases left. Didn't want to overload this PR.

src/PostgREST/ContentTypes.hs Outdated Show resolved Hide resolved
src/PostgREST/ContentTypes.hs Outdated Show resolved Hide resolved
@wolfgangwalther
Copy link
Member

Thanks for picking this up @wolfgangwalther ! Will not touch the history :-) There are a few blurs between the commits, and some decisions that I later changed (like splitting out PgVersions into a separate module). But I hope it's still relatively straightforward to follow!

Yes, I figured - I am kind of jumping between the individual commit and the final result to cross-check whether any of my suggestions have already been addressed in a later commit.

  • We were quite inconsistent with the use of the Types module. Major types like ApiRequest, AppConfig etc. live in their own modules, while Types is a large mixed bag that contains all kinds of concerns. In this PR I try to put type declarations where they fit naturally. Another way would be to split all Types into Types.XXX modules.

I think putting them where they belong naturally is best.

  • Config mixes two concerns, CLI and Config parsing. Suggest that we split the CLI part into a separate module

  • There is a lot happening in Main, with the complexity increased further by the conditional CPP compilations. Should probably move more into modules that can be tested.

You did some of that already, didn't you? Was left out of the App to ease my review - but I remember lots of good stuff there. Not sure how easy this would be to rebase, because Steve did some work around the db-config stuff in that area by now.

Also, still many wildcard imports and inconsistent aliases left. Didn't want to overload this PR.

Yes. I agree with the approach: Change some of them as you go by, but do it for good in a separate PR afterwards!

@monacoremo
Copy link
Member Author

You did some of that already, didn't you? Was left out of the App to ease my review - but I remember lots of good stuff there. Not sure how easy this would be to rebase, because Steve did some work around the db-config stuff in that area by now.

Yes, should not be too difficult to rebase/port, will do that in a separate PR!

@monacoremo
Copy link
Member Author

Update on our rats nest after splitting DbStructureTypes:

modules

@monacoremo
Copy link
Member Author

Current progress:

modulesnew

@steve-chavez
Copy link
Member

Sorry, a bit lost here. Is the idea to make the modules graph "less connected"?

I do see that the refactoring will make the imports clearer and the files smaller, which seems good to me.

@wolfgangwalther
Copy link
Member

Sorry, a bit lost here. Is the idea to make the modules graph "less connected"?

I do see that the refactoring will make the imports clearer and the files smaller, which seems good to me.

I'm working on summing up the idea behind this (as I understand it) and sketching out the ultimate goal ("What could it look like?"). Should be posting that in a few hours.

@wolfgangwalther
Copy link
Member

Looking at what I commented on while reviewing, so far, I realized I have kind of a target structure / goal in mind. Before going forward with the review, it might be worth it to take a step back and:

  • make sure our goals are aligned (which I think they are!)
  • give Steve the chance to at least approve the goal, so he doesn't have to go all through this, too :D
  • maybe split the way to get there in a few steps, to make the scope for this PR clear. This would probably help me in not throwing every possible idea for refactor out there, when I read through the code.

Steve:

Sorry, a bit lost here. Is the idea to make the modules graph "less connected"?

The idea is to give the code-base a better internal structure. This would make the modules graph less connected, yes - with a few clearly defined interfaces between core parts of the system. Down the road this might make it easier to add some kind of unit tests, too. Only where we need them of course, but I did already have situations where I thought that would be helpful. I think the current tests are more like integration tests that go through the whole system. The main reason to do this, of course, is to make the code better maintainable. For us and also for beginners, joining the project.

Remo:

Seems like it needs to get worse before it get's better:

The reason for this is, that we ultimately need to use nested namespaces / modules (as we already discussed in a few specific cases) and need to re-export symbols at the top level of those namespaces. Only this approach will give the proper structure while keeping the filesizes small enough to manage. We could even add a CI check to avoid adding imports "into" namespaces from "somewhere else", later on.

If we follow this idea, I think we can separate out some core functionalities in the following namespaces:

  • Request
  • QueryBuilder
  • Context (config and schema - everything that is loaded once in the main thread, but used in every request)

I came up with the following graph, created from scratch:

goal

Some of that might not work exactly like that, but the general flow is clear:

  • Main will start up, handle connections, UnixSocket, CLI interface and load the whole Context - then runs App
  • App needs to Auth, parse the Request and then build a query with QueryBuilder
  • Everything depends on Context

I see a lot of similarities in your latest graph @monacoremo, already, so I think we're on the same track here?

Other than Types.hs, we do have a few other "generic files", that need to be split up, imho:

  • Middleware.hs
  • Parsers.hs
  • Error.hs (I am not really sure about this, yet)

Before we talk about what steps we need to take and which scope this PR should have: @monacoremo is this what you have in mind, too? @steve-chavez what do you think about a structure similar to that? We will touch allll files and basically nothing will be where it was before... :D

@steve-chavez
Copy link
Member

The idea is to give the code-base a better internal structure.
I came up with the following graph, created from scratch:

Got it, the image makes it clear as day 🌞

Down the road this might make it easier to add some kind of unit tests

I would suggest trying doctests, it's much easier to understand a test when it's close to the function(rather than jumping to another file).

what do you think about a structure similar to that? We will touch allll files and basically nothing will be where it was before..

By all means go for it! I can already see that the result would be much better for maintenance.

@monacoremo
Copy link
Member Author

Very nice summary @wolfgangwalther ! As I was in deep into exploration mode here, I don't think I could have put it that clearly myself :-)

I see a lot of similarities in your latest graph @monacoremo, already, so I think we're on the same track here?

Yes, exactly! The exact structure will emerge through iteration, but that's the kind of target picture we're going for.

The reason for this is, that we ultimately need to use nested namespaces / modules (as we already discussed in a few specific cases) and need to re-export symbols at the top level of those namespaces.

Fully agree! Now that we've pulled everything apart, I'm starting to put sensible modules together, e.g. Config:
6ea525b

The challenge is essentially to design nice internal APIs. E.g. with this commit 11d7020 we separate all the query building logic from app, fully encapsulating Statements, DbRequestBuilder, QueryBuilder etc. That will be easier to reason about, and enable further simplification on the query building side.

I would suggest trying doctests, it's much easier to understand a test when it's close to the function(rather than jumping to another file).

Would love to try that! Even more important to have nice internal APIs.

maybe split the way to get there in a few steps, to make the scope for this PR clear.

Yes, we will need to merge intermediate steps. I think this PR will be mostly about pulling stuff apart, future PRs will converge on a well ordered, stratified structure step by step.

@monacoremo
Copy link
Member Author

rebased, no changes to commit history

@wolfgangwalther
Copy link
Member

maybe split the way to get there in a few steps, to make the scope for this PR clear.

Yes, we will need to merge intermediate steps. I think this PR will be mostly about pulling stuff apart, future PRs will converge on a well ordered, stratified structure step by step.

Ok, so for this PR:

  • no renaming of functions, modules or files
  • no re-exports and nested modules, yet
  • some styling and changes to import statements happen naturally, but I'll just ignore for now (will be focused in a different PR)
  • Goal 1: split up all the generic files
  • Goal 2: split up other files, aligning them for nested module namespaces later on (Let's focus on the 3 areas Context, Query and Request for now - we can see about the exact naming later)

Does that align with what you did in all the commits I haven't looked at, yet? If yes, I will review along those lines.

@monacoremo
Copy link
Member Author

monacoremo commented Apr 10, 2021

Does that align with what you did in all the commits I haven't looked at, yet? If yes, I will review along those lines.

Pretty much aligned, the reality is a bit more explorative though:

  • Priority 1: Split up Types.hs into sensible modules
    • PostgREST.ApiRequest.Preferences
    • PostgREST.Config.JSPath
    • PostgREST.ContentType
    • PostgREST.DbStructure.Identifiers
    • PostgREST.DbStructure.Proc
    • PostgREST.DbStructure.Relation
    • PostgREST.DbStructure.Table
    • PostgREST.Headers
    • PostgREST.PgVersions
    • PostgREST.Query.Types
  • Priority 2: Clean up the 'Private' modules:
    • Common: dissolved
    • ProxyUri: Submodule of Config (already 'properly' wired in for now with re-exports)
    • QueryFragment (moved as a submodules of new Query module)
  • Priority 3: Encapsulate all the query generation stuff behind one module, especially from the perspective from App. I always got confused with what Statements, DbRequestBuilder, Private.QueryBuilder etc. did - now at least they are all behind one small API . We get:
    • PostgREST.Query (new, pieces refactored from App.hs)
    • PostgREST.Query.Builder (former QueryBuilder)
    • PostgREST.Query.DbRequestBuilder
    • PostgREST.Query.SqlFragment (former Private.QueryFragments - better represents the main type it exports)
    • PostgREST.Query.Statements
    • PostgREST.Query.Types (carved out from the former Types.hs - will require further cleanup)

I suggest we leave Parsers, Middleware and Errors for a later PR.

I tried to keep renaming minmial - I don't think I touched any functions, main changes are new modules from Types, Query and s/Private.QueryFragment/Query.SqlFragment/

src/PostgREST/ContentTypes.hs Outdated Show resolved Hide resolved
src/PostgREST/PgVersions.hs Outdated Show resolved Hide resolved
src/PostgREST/DbStructure.hs Show resolved Hide resolved
src/PostgREST/Headers.hs Outdated Show resolved Hide resolved
@wolfgangwalther
Copy link
Member

Does that align with what you did in all the commits I haven't looked at, yet? If yes, I will review along those lines.

Pretty much aligned, the reality is a bit more explorative though:

Sure. It's just a question of scope - which parts should I comment on now to change for this PR and which one will we keep for later.

After looking through a few more commits, I realized you did nest a few modules already (in terms of directory structure). And as you said, also did a few of the re-exports already. Shall we do all of them in this PR, then? Or should I just ignore this for now, and we'll focus on it later?

In any case I will comment on naming the modules according to the nesting we want to have later.

@monacoremo
Copy link
Member Author

After looking through a few more commits, I realized you did nest a few modules already (in terms of directory structure). And as you said, also did a few of the re-exports already. Shall we do all of them in this PR, then? Or should I just ignore this for now, and we'll focus on it later?

Definitely a later PR, let's see what was done here as proof of concept and ignore :-)

In any case I will comment on naming the modules according to the nesting we want to have later.

Yes, sounds good!

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Apr 10, 2021

  • Priority 3: Encapsulate all the query generation stuff behind one module, especially from the perspective from App. I always got confused with what Statements, DbRequestBuilder, Private.QueryBuilder etc. did - now at least they are all behind one small API . We get:

    • PostgREST.Query (new, pieces refactored from App.hs)
    • PostgREST.Query.Builder (former QueryBuilder)
    • PostgREST.Query.DbRequestBuilder
    • PostgREST.Query.SqlFragment (former Private.QueryFragments - better represents the main type it exports)
    • PostgREST.Query.Statements
    • PostgREST.Query.Types (carved out from the former Types.hs - will require further cleanup)

Ah, thanks for the summary. I feel this line of separation makes Query pretty big and Request pretty small.

I would move PostgREST.Query.DbRequestBuilder into the Request namespace. This is because it turns an ApiRequest into ReadRequest and MutateRequest. There is still no SQL involved. Yes, info about our schema, but not SQL (side-note: returningCols is just wrong in there imho - it should be somewhere in the Query namespace). For me this is still part of "understanding what the user wants to do in context of our schema" = Request - and not "translating this to SQL" = Query.

Also have a look at the theoretical graph I posted above - it has Request.Builder.ReadTree and Request.Builder.Mutation - that would be all the stuff from DbRequestBuilder.

WDYT?

@wolfgangwalther
Copy link
Member

Let me add to my comment before: In my graph I had QueryBuilder instead of Query, too. My idea was to separate the "building the query" from "running the query and parsing the results". The 2nd part I had not really put into the graph, I guess it would have stayed in App. I like how you moved that out of App, but I think we need to either:

  • split the Query children into Builder and Runner or so (fragments, statements, ... are all only related to Builder, not Runner)
  • split the whole Query namespace in QueryBuilder (along the lines of my sketch) and QueryRunner or so. I much prefer this, this would yield a structure a bit flatter and map the workflow of a request nicely into 3 main parts:
    • Request: interpreting the users wish
    • QueryBuilder: Translating the request into SQL
    • QueryRunner: Run the SQL and return the results

@wolfgangwalther
Copy link
Member

Let me add to my comment before: In my graph I had QueryBuilder instead of Query, too. My idea was to separate the "building the query" from "running the query and parsing the results". The 2nd part I had not really put into the graph, I guess it would have stayed in App. I like how you moved that out of App, but I think we need to either:

* split the `Query` **children** into `Builder` and `Runner` or so (fragments, statements, ... are all only related to Builder, not Runner)

* split the whole `Query` namespace in `QueryBuilder` (along the lines of my sketch) and `QueryRunner` or so. I much prefer this, this would yield a structure a bit flatter and map the workflow of a request nicely into 3 main parts:
  
  * `Request`: interpreting the users wish
  * `QueryBuilder`: Translating the request into SQL
  * `QueryRunner`: Run the SQL and return the results

Alternative naming suggestion, with the same contents:

  • Request
  • Query
  • Response

@monacoremo
Copy link
Member Author

Yes, that's a very good thought! DbRequestBuilder is a bit misnamed then, it is pretty independent from any DB things.

Ideally we would have only one type coming out of parsing a request, not an additional intermediary step. And we would have a clear pipeline:

parseRequest :: Context -> Wai.Request -> Either RequestError Request

(Request being some combination of ApiRequest and ReadRequest/MutateRequest; internally the processing might happen in multiple steps)

Then we build a query from it

buildQuery :: Request -> Query

(note: the request would be so thoroughly parsed that query creation cannot fail)

Then we run the query

runQuery :: Query -> Either QueryError QueryResult

Then we render a response

httpResponse :: QueryResult -> Wai.Response

So App would become something like (very roughly)

response :: Context -> Wai.Request -> Wai.Response
response = httpResponse =<< runQuery . buildQuery =<< parseRequest

We are still quite distant from that however... I'll move DbRequestBuilder into ApiRequest for now, but we'll not be able to offer a clean request API in this request, as still everything is tied up in App.hs; so App.hs will still directly access that module. Maybe, to make our intention even clearer: Let's move ApiRequest to Request also, so we have

  • Request.ApiRequest
  • Request.DbRequestBuilder
  • Request.Types (from what we carved out of Types.hs, currently Query.Types)

leaving imports as is, and we'll sort out the mess in later PRs. WDYT?

src/PostgREST/Private/QueryFragment.hs Outdated Show resolved Hide resolved
src/PostgREST/Middleware.hs Outdated Show resolved Hide resolved
@wolfgangwalther
Copy link
Member

Yes, that's a very good thought! DbRequestBuilder is a bit misnamed then, it is pretty independent from any DB things.

Ideally we would have only one type coming out of parsing a request, not an additional intermediary step. [...]
[...]
(Request being some combination of ApiRequest and ReadRequest/MutateRequest; internally the processing might happen in multiple steps)
[...]
(note: the request would be so thoroughly parsed that query creation cannot fail)
[...]
[...] Maybe, to make our intention even clearer: Let's move ApiRequest to Request also
[...]
leaving imports as is, and we'll sort out the mess in later PRs. WDYT?

This is exactly my thought.

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Apr 10, 2021

Before you go on another commit-spree, I have currently reviewed everything except:

  • Refactor Statements within App.hs
  • refactor Proc statement in App
  • select functions to be carved out of App
  • carve Query out of App
  • move query-related modules

My understanding is that the last commit is about that DbRequestBuilder stuff and going to be superseded, soon.

However all the other 4 commits seem to make a nice package - and there's actual refactoring involved, not just renaming.

Since the commit history has not progressed far beyond that, yet: Do you think it makes sense to make a cut before the App refactor (move that to a different PR) and address all the review comments before refactoring App?

You should be able to to just move the fix errors and PgVersion commits in front of that, too, I think.

Edit: Maybe it's not really needed, though. I didn't really look at the App commits, yet. If those still fit in the bigger context here and are easy to review, then we don't need to do that.

@monacoremo
Copy link
Member Author

So... As said, let's make a cut at this intermediate stage.

We've now sketched out some of the intended structure with Request and Query. Response will materialize out of different pieces, especially ones carved out from App. Many of the links we currently see between Request and Query are issues to be fixed.

@wolfgangwalther
Copy link
Member

How are we going to proceed with this - should we merge this into main and then follow-up with PRs there? Or should we create another branch like mega-refactor ;) on the main repo and target all the PRs there, including this one?

I feel like the extra branch might make sense to avoid having the intermediate state on the main branch. It's a lot of changes - and probably not really in a state that's less confusing than before, yet.

WDYT?

Copy link
Member

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not check the latest rebase regarding merge-conflicts, I'll have to trust you on that.

@monacoremo monacoremo mentioned this pull request Apr 11, 2021
27 tasks
@monacoremo
Copy link
Member Author

I feel like the extra branch might make sense to avoid having the intermediate state on the main branch. It's a lot of changes - and probably not really in a state that's less confusing than before, yet.

Yes, good question. I sketched out some ideas for next steps in #1804. I would argue for merging to main right now:

  • We currently don't have many active PRs, so disruption should be minimal
  • The intermediate structure will allow us to divide work better than we currently could (e.g. work on Query and Request separately)
  • Crafting good APIs for Request and Query might take some time/iteration, so working on a separate branch will risk that we significantly diverge, causing us additional work and slowing down development

@steve-chavez , would you be ok with merging to main?

@monacoremo monacoremo changed the title [WIP] refactor: Split up Types.hs into logical modules refactor: Split up Types.hs and logically organize modules Apr 11, 2021
@steve-chavez
Copy link
Member

@monacoremo Sure! Merge time 🚀

@monacoremo
Copy link
Member Author

@monacoremo Sure! Merge time rocket

Bye bye Types.hs 🔥

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants