Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to tell logidze about a new column #232

Open
miharekar opened this issue Mar 10, 2023 · 8 comments
Open

How to tell logidze about a new column #232

miharekar opened this issue Mar 10, 2023 · 8 comments

Comments

@miharekar
Copy link
Contributor

So I have an interesting issue. I've added an integer column featured_photo that's null by default to a model that has logidze. And when I set the column for the first time, logidze enters it as expected:

{
  "c"=>{"updated_at"=>"2023-03-10 15:38:37.605265", "featured_photo"=>39186},
  "m"=>{"_r"=>2},
  "v"=>9,
  "ts"=>1678462717605
}

But when I want to check the value at any version, it always returns the current version one even when it was nil in previous versions:

irb(main):027:0> cafe.at(version: 8).featured_photo
=> 39186
irb(main):028:0> cafe.at(version: 9).featured_photo
=> 39186
irb(main):029:0> cafe.at(version: 1).featured_photo
=> 39186

I looked through the source code and the hammer fix is to add

object_at.restore_attributes(object_at.attribute_names - ["log_data"])

on line 251 here before we start restoring attributes.

But I'm not sure if that'll have some other non-desired consequences.

Ideally on new column creation I would be able to add new version to all models with that column being nil or whatever the default value is. But that's not there, so to go back to the issue title: what would be a "correct" way of letting logidze know about a new column?

@adas172002
Copy link

Have you used Tracking only selected columns —except=/ —only= option in your model?
Otherwise looks a lot like my case #230

@miharekar
Copy link
Contributor Author

@adas172002 I did not - I track everything. And even if I save the model again with a change on a different column Logidze/PG doesn’t detect a change in column because the column didn’t exist before and is null now.

@palkan
Copy link
Owner

palkan commented Mar 29, 2023

But when I want to check the value at any version, it always returns the current version one even when it was nil in previous versions:

I think, it can be considered a bug.

However, I don't have a quick solution in mind. The way Logidze travels back is by creating an aggregated diff (from the initial version to the specified) and applying to the current state of the record.

If we can assume that the initial version is full snapshot, we're can nullify all the attributes not present in the diff. But I'm not sure about the assumption—does it always hold? Not really: if we filter columns, we shouldn't nullify the ones we do not track.

So, the solution could be:

  • Nullify all attributes present in the current schema and missing in the diff.
  • In case not all columns are being tracked, we must only take into account trackable columns in the schema. We can try to read this information from Postgres.

Does anyone see any other edge cases?

@miharekar
Copy link
Contributor Author

Sounds reasonable, yes 👌

@jerometremblay
Copy link

If the default value of the new column is not null, I think it should not be nulled either

@palkan
Copy link
Owner

palkan commented Mar 30, 2023

If the default value of the new column is not null, I think it should not be nulled either

That's a good point; we can restore it to the default value; however, it's not always possible to obtain this information from Active Record, so, I guess, in this case we should still use nil.

@andychongyz
Copy link

What if in Lodigze::History#changes_to, instead of making a reduction to accumulate the attributes from the log changes until the version or time. It will also check for missing attributes by checking the @attributes and continue to accumulate until all missing attributes are found.

Then we will get all the attributes that are closest to the version or time we requested, instead of the latest data.

@jcsrb
Copy link
Contributor

jcsrb commented Jul 18, 2023

I did run into a similar situation. In my case, there were both tracked and untracked column.
To make it work properly, I patched up the history with nil values for the newly added fields .

Something like

Model.find_each(batch_size: 50) do |subject|
  subject.with_lock do
    raw_history = subject.read_attribute_before_type_cast(:log_data)
    parsed = JSON.parse(raw_history)
        
    next if parsed["h"][0]["c"].key?("new_field")      
      parsed["h"][0]["c"]["new_field"] = nil
      altered_history = parsed.to_json

      Logidze.without_logging do
        subject.update_column(:log_data, altered_history)
      end      
    end 
  end
end

note here that I always take snapshot right after creation, so I know version 1 should have nil for the new_field

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants