Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bookshelf Next - New Project Organisation, call for developers and ideas #1600

Closed
Playrom opened this issue Jul 18, 2017 · 94 comments
Closed
Assignees

Comments

@Playrom
Copy link
Contributor

Playrom commented Jul 18, 2017

Hello Developers!

Last month the project creator and main maintainer @tgriesser has granted administrative power and ownership of bookshelf to a new organisation, which will oversee the future of our beloved ORM.

Reference: #1511


Me and @dj-hedgehog are some of the members of this organisation.

We want to restart the development of bookshelf, and we have already started the work with the merge of new Pull Requests, and the release of version 0.10.4

We are also analysing the issues backlog

.

But we need your help to thrive the project, in form of collaborations, new ideas, PRs etc…

If you want to help please follow this guidelines

  1. Use this thread to discuss about the future of the project or to propose yourself as developer
  2. Open a new issue for every idea you might want to be implemented in the future
  3. Submit a pull request

In the scope of reduce the huge size of the backlog we will close all the PRs and Issues older then December 2016, asking to reopen them if issues are still relevant, and PRs are still valid.

We will use the Github Projects to organise our work

We can't work alone, because this is why bookshelf started to fail this winter, we need your help and experties.

Thank You for your love for bookshelf and for every advice and help you will give us

Playrom added a commit that referenced this issue Jul 18, 2017
we added a temporary link in the README to direct all developers to the discussione about the future of bookshelf.js. It will be removed in a few weeks
Playrom added a commit to Playrom/bookshelf that referenced this issue Jul 18, 2017
commit c001e8a
Author: Giorgio Romano <Playrom@users.noreply.github.com>
Date:   Tue Jul 18 12:49:33 2017 +0200

    Link to the bookshelf#1600 discussion in README

    we added a temporary link in the README to direct all developers to the discussione about the future of bookshelf.js. It will be removed in a few weeks

commit 36cf1b1
Author: Michael Borst <mborst@users.noreply.github.com>
Date:   Tue Jul 18 12:01:47 2017 +0200

    Allow passing of time via options (bookshelf#1592)

    * Fix indent in timestamp method

    * Allow passing of time via options

commit cd9e9d2
Author: kirrg001 <katharina.irrgang@googlemail.com>
Date:   Mon Jul 17 17:01:05 2017 +0200

    Release 0.10.4
@ricardograca
Copy link
Member

I just have one (probably unpopular) suggestion and that is to drop babel. Since node versions 0.10 and 0.12 are no longer supported is it really still necessary?

@Playrom
Copy link
Contributor Author

Playrom commented Jul 19, 2017

@ricardograca open an issue about that, I'm open to a discussion about this point ( I hate babel and no-es6 javascript )

@rmharrison
Copy link
Contributor

@TJKoury: On a related note to JSON schema support, we use jsonapi-mapper for serialisation.

ricardograca added a commit that referenced this issue Jan 5, 2018
@ricardograca
Copy link
Member

ricardograca commented Jan 9, 2018

I just wanted to update this with my personal opinion on how things should move forward, and provide an update to this.

Short term

First, I'm in the process of merging or closing all open PRs. It doesn't make sense to let them gather dust like that. However, some of them contain breaking changes so caution is needed.

I just released a new version of Bookshelf that contains most of the PRs that were safe to merge: mainly documentation and small bug fixes. The rest will be taken care of soon and most should be handled by the next release. This will be a somewhat breaking release because we'll finally add the engines key to require Node 4 or newer. People still using older versions will have to continue using version 0.12.x of Bookshelf. I just created a new 0.12 branch for dealing with any possible updates to this version, but it will generally be unsupported once the new one is out.

Some PRs can't be merged in their current form, even if they contain valuable changes, because they target an older version of Bookshelf that didn't use babel and/or their authors don't want to update them to address some concerns that were raised in the mean time. These will have to be recreated again.

After all PRs are dealt with, or when only unmergeable PRs are left, the time will come to get rid of babel. See #1606 for the reasoning and discussion around that change.

Long term

That was the short term plan. The longer term plan is to finally take Bookshelf in the direction outlined in #552, along with some of the ideas expressed in #1661. That is, to convert it from the sort of active record type ORM that it is now to a data mapper ORM.

In the middle of all this, the test suite should also be overhauled because it has a few issues right now. The biggest, in my opinion, is that it doesn't have any kind of isolation between individual tests. Many depend on each other, the working mock dataset isn't reset between each individual test and they must all be run in a specific order. That makes any refactoring or addition of new tests harder than it needs to be, as well as making the tests' validity questionable.

General Guidelines

We should also decide on a few guidelines to help steer development and decision making. These are the ones that I can think of:

  • Data Mapper: always thrive to achieve separation between the models and the underlying database.
  • Don't depend on the query builder so much: this is implied in the first one, but to explain a little better, any functionality that can only be achieved by using the query builder and that comes up a lot should probably be converted to a built-in process. Ideally there shouldn't be the need to access the query builder at all.
  • Don't require a database: also implied in the first one, but is a concern that has come up a few times before. What this means is that all database specific code and any abstractions required to make something work in all supported database engines will be handled by Knex.
  • The simpler the better: if there are two solutions to a problem, go with the one that is clearly the simpler and less invasive, even if it isn't as complete as the other one.
  • Prefer more methods that do smaller individual things than less with too many configuration options each.
  • Have only one way to do things.
  • Don't shock people too much: that is, don't try to change too much at the same time. Small gradual changes are the way to go.

The second last one is especially concerning to me, because e.g. there are a lot of ways of fetching a model from the database currently, and all these options come at a cost of user confusion, inconsistent behavior, results and documentation.

@maxnordlund
Copy link

Here goes my two cents. I really would like to be able to use ES6 classes, which I believe is already the plan but worth mentioning again. Related to that is implementing some of the newer protocols, iterator/async iterator and awaitable (which is really just thenable).

Now that a major redesign is in progress I think it would be prudent to look at some of the other ORM:s out there. not just in Node, but also in general. On top of my head I'm thinking of SQLAlchemy and Djangos ORM in Python, Active Record and Data Mapper in Ruby, and finally Ecto in Elixir. (Of course I've read the plan to move away from Active Record toward Data Mapper, but still it's good to get some inspiration).

I'm really looking forward to see where this is going, as its not everyday an established ORM takes time to do such an major overhaul.

@davidfurlong
Copy link
Contributor

davidfurlong commented Jan 9, 2018

I really would like to be able to use ES6 classes, which I believe is already the plan but worth mentioning again

I think that's already the case: class User extends Bookshelf.Model {

awaitable (which is really just thenable)

I think this is also the case as it's just Promises (depending on node version obviously).


(We use both above in production)

@esatterwhite
Copy link

I would like to second @maxnordlund thoughts. python has some surprisingly good ORM / data mapper tools.

these are honestly a joy to work with

These query apis make virtually anything possible.

The thing that has always been frustrating with ORMs in node is that writing SQL and mapping relations is usually a requirement rather than a low level feature. And if that is the requirement, I don't really want to or get much benefit in using that ORM.

@ricardograca
Copy link
Member

I'm really looking forward to see where this is going, as its not everyday an established ORM takes time to do such an major overhaul.

Well, the plan is to do it as gradually as possible, instead of trying to change everything from one version to the next. Some already tried that before and failed. It's just too big of an undertaking to do in one go.

Thanks for the suggestions on good ORMs to check out.

@mrhwick
Copy link
Member

mrhwick commented Jan 10, 2018

I would love to see bookshelf move in the direction of mimicking the patterns of the Django ORM. It is an excellent example of separating the data model from the storage solution. When working with that ORM, I almost never need to think about the database underneath the modeling layer.

I think we can progressively approach that goal. my next milestone is to get bookshelf's testing story to a breakthrough moment where it is concerning itself with only the layer of abstraction which it provides. All the edge cases of the various database dialects and features... leave that work to knex, the database connector.

@maxnordlund
Copy link

TL;DR All abstractions leak, and it's important to always provide a safe escape hatch for those funky queries.

I use Django at work and have found it confusing and a bit frustrating to work with. It's some small things, like avoiding SQLy method names, e.g. where, to Pythonic ones, e.g. filter. Speaking of which, filter actually represents both WHERE and HAVING, which makes it kinda confusing to use. Same goes for group by vs annotate, where the abstraction gets in the way.

What I'm trying to get at, is the more abstracted the ORM gets, the harder it becomes to debug but it also makes it easy to do the common things like getting a single row/object via primary key. You end up needing to know just how the ORM transforms your method calls into SQL. At least for me when writing group by queries using Django I start with the SQL and work my way backwards.

This is also the argument of SQLy vs JavaScripty, where I personally like the SQL variant more. That also forces the developer to get to know how the database works, which helps debugging and profiling bad queries.

On the other hand I do love the convenient short cuts, like Post.get(), and post.save(). Especially if the latter can take an arbitrary object graph and insert/update as needed while doing it in the correct order to ensure foreign keys are setup properly. (Sounds harder then it is, you can use DFS/BFS to bucket the objects per model/table and then DFS reverse foreign key relations to automatically ensure correct dependency order.) Even better if it uses batch insert, and/or insert on conflict update.

That, and join handling which quickly becomes tedious IMO, with preloading of relations to avoid N+1 quires. I'm sure there are more stuff an ORM helps with that a raw SQL library won't do, but if it can be done by the query builder, why reimplement it?

@ricardograca
Copy link
Member

...if it can be done by the query builder, why reimplement it?

Well, in that case it seems like you just want to use a query builder. So, it's a matter of preference really. Do you want to build the queries yourself or have some higher level abstraction?

I bet some would argue why you'd need a query builder at all when you can just use the specific database adapter.

If we depend too much on having to know how the database works and how to generate correct SQL then there's no point in Bookshelf at all. It's job is to precisely abstract the database layer, so if it's not doing that it has problems. If that's the case, then you're right and you're better off just using a query builder.

@chamini2
Copy link
Contributor

chamini2 commented Jan 10, 2018

@maxnordlund, I also like SQLy abstractions more than JSy asbtractions, you are ultimately querying a DB, so why hide it?

I recommend using Objection.js for this point of view. Where I work, we are currently migrating our backend from Bookshelf to Objection.js and for our case and preferences (we like SQL a lot), is perfect.

@esatterwhite
Copy link

esatterwhite commented Jan 10, 2018

When I encounter ORMs that look and feel like SQL or make me write a lot of sql to fill in the gap, I really rather just write the query in sql and drop the ORM all together.

It's much easier and performant to just a pass a parmaterized string and send it to the driver. It'll be much faster than any amount of abstraction. It's easy to debug because you can actually see the query. The more I use things like knex, the less useful I find it.

const SQL = 'SELECT some sql  FROM table WHERE complexity > $1'
postgres.query({
  text: SQL
, values: [ 10 ]
}, cb)

call it a day.
so I'm on the other side of the fence. If I'm going to use an ORM-y abstraction, I'd prefer abstractions the fit in with the language I'm coding in. Otherwise I'd rather just write the SQL.

@maxnordlund
Copy link

Hm, I think I wasn't very clear on what I meant. I also wasn't very nice, and I apologize for that, expressing one opinion shouldn't feel so full of vitriol.

I love when the ORM helps me write some of the tricker queries, like JOIN's, using the knowledge provided by the developer. This also applies for loading related models in an performant way, i.e. using either JOIN(s) or SELECT * FROM table WHERE id in (...).

I'm very grateful when the ORM helps me save an object graph in the correct order, setting foreign keys as it goes, using bulk INSERT's and UPDATE's where necessary etc.

Then there's validation, eager or lazy, but having a structured way of validating an object is so helpful. Is this field required? Is it inside of this range? ... This can then be used for displaying errors to the user in some sort of <form>.

At the same I want to be able to drop down to writing quires by hand, using the builder or just raw strings. Some ORM's make this difficult, which can be frustrating to work with. Other makes it a breeze, and it makes you happy. For example, Django allows for adding extra fields from GROUP BY queries to your models, which allows stuff like counting related models.

Speaking about query builders, I prefer them over raw SQL since they help you avoid injection attacks. They are also reusable, which makes it easy to append WHERE conditions to some long JOIN chain.


I recommend using Objection.js for this point of view. Where I work, we are currently migrating our backend from Bookshelf to Objection.js and for our case and preferences (we like SQL a lot), is perfect.

@chamini2 actually for my hobby project I did end up using that. But I think there're a lot of potential in Bookshelf and hope to use it in the future.

@dantaub
Copy link

dantaub commented Feb 13, 2018

What I would really like to see is core or plugin support for the interface layer between the backend and frontend. I understand that many people use Backbone on the frontend since the APIs are compatible, but I see no reason why Bookshelf models (absent the DB connections) couldn't be used directly, assuming one follows best practices regarding attribute whitelists, like those available in the 'visibility' plugin.

@ricardograca
Copy link
Member

...interface layer between the backend and frontend.

Do you mean to provide a plugin or do whatever changes are necessary in core to allow Bookshelf to be used on the client-side?

@dantaub
Copy link

dantaub commented Feb 13, 2018

That's a good start -- it would be nice to have a stripped-down build of Bookshelf with minimal dependencies that we could include client side. Coupled with a caching system, that could be pretty powerful. I had been thinking just to create a parallel build process that creates classes or objects from Bookshelf models for frontend inclusion. Another compatible design is a plugin that, given some model, reduces the boilerplate needed to create Node CRUD routes that obey permissions set on the reqs (this could build on the ideas in the advanced-serialization plugin, or depend on some middleware to set the permissions).

@ansem78
Copy link

ansem78 commented Feb 24, 2018

Bookshelf is really a great ORM IMHO, but some features should be heavily revised. The main thing that should be improved a lot is the documentation. Many points are not clear and sometimes contradictory. The examples are too simple for a real-life development and are sometimes incomplete and/or confused. Collections should be removed and treated as simple arrays and all options passed to Model.fetch() (withRelated, columns, etc.) should also be valid for all models in a collection (fetchAll() does not accept these options and forces to implement collections). When retriving a model, its relations should be accessible when the model is sent to the client as a JSON object (now I have to manually set up a property in the model using Model.related() before sending it to the client). For example, if I retrieve a relation by calling a role() method on a user model, the user object should have a "role" property with the role object.

@ricardograca ricardograca changed the title Bookshelf Next - New Project Organisation, Version 0.11, Call for developers and ideas Bookshelf Next - New Project Organisation, call for developers and ideas Feb 25, 2018
@ricardograca
Copy link
Member

ricardograca commented Feb 25, 2018

@ansem78 If you have any specific issues with the documentation it would help if you opened issues explaining those problems.

Agreed about dropping collections. That is one of the medium term goals: #799.

About relations not being present when converting to JSON I'm certain that already happens. If you do:

new Customer({id: 1}).fetch({withRelated: 'settings'}).then(function(model) {
  console.log(model.toJSON())
})

you'll get the expected customer with a sub-object with the related settings. Maybe you have the visibility plugin active and that is conflicting with your expectations?

@ansem78
Copy link

ansem78 commented Feb 26, 2018

Hmm...I have not considered to convert the model to JSON before the method returns it. I am a beginner with Bookshelf and now I have set up the Express API routes to return JSON with response.json(model). In this way, the response JSON data have not any relation set. I read that calling toJSON() will convert the model with all its relations. But I thought the response.json() Express method acts the same way. I will try to convert model to JSON before sending the response. Thanks for the suggest.

For the docs, just for example, I read: "The format method is used to modify the current state of the model before it is persisted to the database". But the next paragraph says: "Do note that format is used to modify the state of the model when accessing the database". This is contradictory. The first statement is correct, but the second implicitly says that also a reading operation calls format, that is obviously false.

@davidfurlong
Copy link
Contributor

davidfurlong commented Feb 27, 2018

@ansem78 Yep response.json() will call the toJSON method of the model.

I think you are expecting this code below to add the relation to the user:

const u = await User.where({ id: 1 }).fetch();
await u.load(['role']) // needs to be assigned to u in order to have reference to the loaded relation

There isn't one set of loaded data => There can be two Models representing the same database row which have some different attributes and different loaded relations. Which can be quite confusing.

@davidfurlong
Copy link
Contributor

davidfurlong commented Feb 27, 2018

  • I agree that collections are confusing and I would see them removed.
  • I also think the docs could do with a lot of work (I would be willing to help)
  • I think transaction handling with a transaction object with const trx = await Bookshelf.transaction(); and trx.commit() is better than the current paradigm (or at least offer this as an option) ~ although this seems to currently be possible with just slightly messier syntax: const trx = new Promise(); const t; Bookshelf.transaction(y => t = y; trx); // use t, trx.resolve() to commit.

@ricardograca
Copy link
Member

ricardograca commented Feb 27, 2018

This is contradictory. The first statement is correct, but the second implicitly says that also a reading operation calls format, that is obviously false.

Well, it's not implicit. It actually says that format will influence fetch() operations, which it does. Take a look at #668. Why do you say it's false? Can you provide a reproducible code sample where removing id attributes in the format method doesn't influence fetch() or eager loading of relations?

@ansem78
Copy link

ansem78 commented Feb 27, 2018

@ricardograca I have tested parse() and format() with a console.log(). When I perform a reading operation (Model.fetch()) parse() is called, but format() is not. I have never removed any attribute in the format() method, so now I understand what you say. format() will influence fetch() operation by removing attributes, but is this behavior correct and/or really useful? I mean, if I read data, I expect Bookshelf will internally call parse() to format attributes (and only the attributes I read: all or a subset using the option "columns"). If I want to remove one or more attributes, I can either read only the attributes I want, or use Model.set() with the "unset" option to remove unwanted attributes. So, format() can be used as a "shortcut" to manipulate the model attributes in a reading operation and also affects the saved model, if I have understand.
Maybe I am wrong and don't think to all possible scenarios, but I find this a bit confusing. I think the simplest patterns to read and write models would be:

Read: DB -> dbRawData.parse() (fetch()) -> Model

Write: jsObject -> Model.format() -> DB

with parse() and format() unable to add/remove attributes. But, as I said, I am a beginner with Bookshelf and ORMs and I have surely not consider all possibile cases.

@ricardograca
Copy link
Member

Please continue that discussion in #668. I've reopened it. Sorry I linked the wrong issue in my previous comment. I've edited it to the correct issue.

@Saeris
Copy link

Saeris commented Mar 10, 2018

Just wanted to chime in here with something to take into consideration. It might be worth spending some time to evaluate modern use cases for ORMs, case in point: using Bookshelf in conjunction with GraphQL.

I started off one of my GraphQL projects using Bookshelf in my resolvers for queries and mutations. This worked fine up until I had to start doing more complex queries that involved deep nesting of types, leading to an N+1 bottleneck. I've since started using join-monster to better handle this situation. Another issue I've had some trouble with was performing upserts, where I may be performing an update mutation on a given type to either modify a few columns on an existing row, or creating a new one entirely assuming there isn't a match on the primary keys for that row. For this I'm using bookshe;f-modelbase.

I'm not entire sure what changes should be made to better support my use-case. It's more than likely I'm just not using Bookshelf correctly to begin with or I just have some fundamental misunderstanding or SQL/ORMs. Both are more than likely, given that I only started doing more full-stack development a year ago, having formerly been exclusively front-end focused. I just know I've spent a long time being confused over Bookshelf's documentation.

What I do understand pretty well is how GraphQL is changing the API landscape. At it's core, it's designed to help reduce the number of network round-trips between client and server down to a minimum, which it's very good at. This, however, does shift a lot of the overhead to the back-end, where now it's much easier to create a bottleneck between the API layer and back-end data sources. Ideally, Bookshelf would do as much optimization under the hood as possible so the database doesn't get hammered with lots of queries. That's what's beyond my area of expertise.

I can try to contribute as much as I can to discussing that particular topic.

Other than that, +1 on using modern syntax features such as async/await.

Might be worth considering using Rollup to compile to different module formats and picking up Typescript/Flow. Looks like there's some hesitance to use Babel in this project. I'd say go all-in with it, because with a combination of modern code bundling tools and babel-preset-env it should be easy to polyfill support for older versions of Node.

@jdrelick-va
Copy link

@Saeris I know you posted a while ago but I just thought I'd recommend taking a look at using dataloaders to solve your GraphQL n+1 issues. It is an approach designed to solve exactly that while also adding some useful caching capabilities. Dataloaders are especially useful in a db-per-service microservice architecture where joins are simply not possible.

@ricardograca
Copy link
Member

Closing this since the discussion already received a lot of useful contributions and 0.14.0 was just released together with a new documentation website redesign.

If you want a new feature or you have a bug to report use the issue tracker for filing specific issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests