Skip to content

Upsert, many_to_many and cast_assoc: insert in join table generated even when the upsert turns out to be an update #4725

@tanguilp

Description

@tanguilp

Elixir version

1.19.5

Database and Version

Postgres 16.1

Ecto Versions

3.13.5

Database Adapter and Versions (postgrex, myxql, etc)

postgrex

Current behavior

Let's assume I have 3 tables country, city and country_cities (the join table).

I insert and update the countries and there associated cities using cast_assoc on a many_to_many association from the Country schema using an upsert.

For example this is the params of the changeset:

%{
  "cities" => %{
    "0" => %{
      "_persistent_id" => "0",
      "id" => "4215",
      "name" => "Lyon modified"
    },
    "1" => %{
      "_persistent_id" => "1",
      "id" => "2456",
      "name" => "Grenoble"
    }
  },
  "cities_drop" => [""],
  "cities_sort" => ["0", "1"],
  "name" => "France",
}

We then apply these params to the existing Country and get a changeset passed to Repo.insert/....

It doesn't work and the transaction is rolled back because a insert in the join table is generated, even though the Country exists and the association (in this case cities) is loaded:

[debug] QUERY OK source="country" db=0.3ms
INSERT INTO "country" ("id","name","updated_at","inserted_at") VALUES ($1,$2,$3,$4) ON CONFLICT ("id") DO UPDATE SET "updated_at" = EXCLUDED."updated_at","inserted_at" = EXCLUDED."inserted_at","traffic" = EXCLUDED."traffic","name" = EXCLUDED."name" RETURNING "updated_at","inserted_at","name","id" [1267, "France",  ~U[2026-05-12 17:54:37.577948Z], ~U[2026-04-28 18:07:49.000000Z]]

[debug] QUERY OK source="city" db=0.3ms
UPDATE "city" SET "name" = $1, "updated_at" = $2 WHERE "id" = $3 ["Lyon modified", ~U[2026-05-12 17:54:37.579648Z], 4215]

[debug] QUERY ERROR source="country_cities" db=0.3ms
INSERT INTO "country_cities" ("country_id","city_id","inserted_at","updated_at") VALUES ($1,$2,$3,$4) [1267, 4215, ~N[2026-05-12 17:54:37], ~N[2026-05-12 17:54:37]]

As you can see the last statement is an insert in the join table.

It is generated because:

  1. in
    changeset = put_repo_and_action(changeset, :insert, repo, tuplet)
    the action is set to :insert
  2. then in
    defp insert_join?(%{action: :insert}, _, _field, _related_key), do: true
    Ecto determines it's a new, fresh insert and therefore a association record has to be added in the join table

Expected behavior

First it remains to be determined if it is a bug.

In my opinion it is, because the changeset contains all the required information (preloaded assoc, ids in the changeset generated by cast_assoc) to determine whether we should insert in the join table or not. Moreover, when using upserts, we except it to be an insert or an update.

I think this bug is a reminiscence of Ecto before upserts were implemented. In this case it made sense to assume that a Repo.insert/... could only be an insert, but this no longer the case since. Maybe just removing this line would make sense, because as far as I know there is no generic way to know if the result of an upsert is a new record or an existing one.

PS: the example is trivial and doesn't represent what I'm really doing, and I don't really have the option to edit cities separately. I'm building a collaborative subsystem based on Ecto changesets. In a nutshell: changes are proposed by the user, then params are stored in the DB and later applied to the current stored object.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions