Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lock when archiving and avoid invalidating sites that have archiving in progress #15272

Merged
merged 9 commits into from Dec 24, 2019

Conversation

diosmosis
Copy link
Member

When archiving for a site begins, we create a Lock and only release when archiving finishes or fails.

The lock's expire time is updated on each query (the TTL can be configured via DI).

There is also a change to log each archiving query (previously this was just in LogAggregator, but now it will affect all archivers).

I think we should deploy this to demo and see if it works after a day or two (after getting tests to pass of course).

Note: with this implementation it's possible for archiving to start while an invalidation is occurring, which would cause issues. I'm not sure how big of a risk it is since the invalidation queries are fairly simple. There could be a few of them though.

Fixes #15170

@diosmosis diosmosis added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels Dec 15, 2019
@diosmosis diosmosis added this to the 3.13.1 milestone Dec 15, 2019
return $idArchive;
}
} finally {
$archivingStatus->archiveFinished($lock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi archiveFinished does not take an argument and above the archiveStarted () does not return a value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

@tsteur tsteur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis left few comments. Be good in general to test this on demo for few days.

$periodDates = $this->getUniqueDates($periodDates);

$idSites = $this->removeSitesWithInProgressArchiving($idSites);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis not sure I understand why we would filter the idSites here and not just invalidate all of them? Is this because we assume they were already invalidated / will be invalidated? Would there be any harm in still invalidating them or the issue that this would cause the remember flag to be removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder maybe we could still invalidate archives for the other idSites as well just not remove the remember flag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We remove them because we know they are currently being archived. We also keep the idSites in the remember flags so they will eventually be invalidated when they are not being archiving.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if there could be meanwhile some other previously finished archives for those idSites that need to be invalidated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be, but because of the way the function parameters are (idSites + list of dates + period type), it's a bit harder to control exactly what to invalidate. We'd have to loop over idSites & dates and check each individually, and invalidate each archive individually instead of with one query (which is what we do now I think).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis it is very important we respect these configured values. We basically need to check if the last archive was done more than the configured time ago even if the archive is invalid (unless the archive is invalid and it is a new day I think then we don't respect these values but not sure? @mattab )

and keep the locks but don't do anything different in ArchiveInvalidator?
I'm not quite fully understand this part but be good in general to maybe still invalidate an archive in ArchiveInvalidator? Not sure of consequences of not using Locks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure of consequences of not using Locks?

It's just an easy way to check if an archive is on-going no matter what machine/process it's on. It's not really necessary, but it might be useful in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon be good to keep it 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just an easy way to check if an archive is on-going no matter what machine/process

@diosmosis it would be big value if it would help with this issue: #8444 -> maybe it would be fixed?

unless the archive is invalid and it is a new day I think then we don't respect these values but not sure? @mattab

correct, these INI settings are only used when the date range includes Today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wanting to make sure we're not going to work on #8444 as part of this issue :)

$this->defaultTtl = $defaultTtl;
}

public function reexpireLock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be maybe reExpireLock? Not too important though...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it looks weird to me too. But so did reExpireLock and I couldn't choose :)

/** @var ArchivingStatus $archivingStatus */
$archivingStatus = StaticContainer::get(ArchivingStatus::class);
$archivingLock = $archivingStatus->getCurrentArchivingLock();
return new ArchivingDbAdapter(Db::getReader(), $archivingLock, $this->logger);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis is there a chance you forgot to commit this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a chance? no... I definitely forgot to commit it :) added now

@diosmosis
Copy link
Member Author

I'll add this to demo soon

@diosmosis
Copy link
Member Author

@tsteur added back old functions and added a test for the Loader method that uses them

@tsteur
Copy link
Member

tsteur commented Dec 18, 2019

@diosmosis maybe we can run this code on demo for a while?

@diosmosis
Copy link
Member Author

@tsteur it should be running on demo2

@tsteur
Copy link
Member

tsteur commented Dec 22, 2019

@diosmosis do you know of any issue on the demo? Maybe we could merge and release a new beta?

@diosmosis
Copy link
Member Author

The metrics for the last two week seem accurate, I'm not aware of any issues (though I don't get emails if demo2 has an issue).

@mattab
Copy link
Member

mattab commented Dec 23, 2019

There was no error from demo2 in the last few days. @tsteur @diosmosis can you merge this and let me know, i'll release the next beta?

@diosmosis diosmosis merged commit 42454eb into 3.x-dev Dec 24, 2019
@diosmosis diosmosis deleted the 15170-archive-invalidation-locking branch December 24, 2019 11:24
jonasgrilleres pushed a commit to 1024pix/pix-analytics that referenced this pull request Sep 22, 2020
…in progress (matomo-org#15272)

* Use a lock when archiving and do not invalidate when archiving is in progress.

* Add and fix tests + modify workflow.

* forgot to add file and remove TODO

* Remove use of argument.

* Add back min archive time processed code and start on tests for it.

* Finish new LoaderTest.

* Fix new tests.
jbuget pushed a commit to 1024pix/pix-analytics that referenced this pull request Sep 26, 2020
…in progress (matomo-org#15272)

* Use a lock when archiving and do not invalidate when archiving is in progress.

* Add and fix tests + modify workflow.

* forgot to add file and remove TODO

* Remove use of argument.

* Add back min archive time processed code and start on tests for it.

* Finish new LoaderTest.

* Fix new tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong data in visitors overview over time
3 participants