Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Move CSV imports into queued jobs #3407

Closed
wants to merge 10 commits into from

Conversation

rjmackay
Copy link
Contributor

@rjmackay rjmackay commented Nov 20, 2018

This pull request makes the following changes:

  • Move csv/:id/import handler into CSVController
  • Replace generic ImportUsecase with ImportCSVUsecase since it was already specific
  • Move code from ImportListener (and a little from the controller) into ImportCSVPostsUsecase
  • Trigger ImportPosts event and then ImportPostsJob from ImportCSVUsecase
  • Add ImportPostsJob to run ImportCSVPostsUsecase

Cannot land till after #3371

Test checklist:

  • composer test
  • I certify that I ran my checklist

Ping @ushahidi/platform

@@ -18,4 +18,21 @@ protected function getResource()
{
return 'csv';
}

public function import(Request $request)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved here since this is now attached to the csv resource.

{
$usecase->setIdentifiers(['id' => $this->csvId]);

$results = $usecase->interact();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fires off the usecase :)

use Ushahidi\Core\Tool\FileReader;
use Ushahidi\Core\Usecase\Concerns\VerifyEntityLoaded;

class ImportCSVPostsUsecase implements Usecase
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this came from Import listener, and prior to that from the old ImportUsecase

// - Provides dispatch()
use DispatchesEvents;

public function __construct(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made this much more concrete because it's actually easier than using the factories

$file->fwrite($contents);

// Get records
// @todo read up to a sensible offset and process the rest later
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up step would be to batch these, similar to the Export job... but doing the simple version first

$this->repo->update($entity);

// ... dispatch an event and let other services know
$this->dispatch($entity->getResource(). '.import', [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All we do here now is update status and trigger the import event

@coveralls
Copy link

coveralls commented Nov 20, 2018

Coverage Status

Coverage increased (+0.01%) to 20.531% when pulling 181cae5 on refactor/csv-import-in-job into d24496b on develop.

{
$usecase->setIdentifiers(['id' => $this->csvId]);

$results = $usecase->interact();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fires off the usecase :)

- Move csv/:id/import handler into CSVController
- Replace generic ImportUsecase with ImportCSVUsecase since it was already specific
- Move code from ImportListener (and a little from the controller) into ImportCSVPostsUsecase
- Trigger ImportPosts event and then ImportPostsJob from ImportCSVUsecase
- Add ImportPostsJob to run ImportCSVPostsUsecase
- Remove Import command
- Remove MappingTransformer
- Remove ddeboer/import and symfony/property-access
Copy link
Contributor

@rowasc rowasc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be OK to merge (after testing it)

refactor: Remove `./artisan import` command and MappingTransfomer
…t-in-job

Conflicts:
	app/Providers/AppServiceProvider.php
	composer.lock
	src/App/Subscriber.php
@rjmackay
Copy link
Contributor Author

I've deployed this and #3408 to sad charlie. I have a few fixes incoming but it appears to work. however because we're not batching the imports we'll have to set the queue timeout really long before we deploy this


public function handle($id, Entity $entity)
{
$userId = $this->session->getUser()->getId();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're sending this to another process we need to tell it which user to use

// load the user from the job into the 'session'
// the CSV itself doesn't save the creator so we're passing this
// in from the job queue
$this->session->setUser($this->getRequiredIdentifier('user_id'));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... I'm sort of wondering if I should do this at the job level instead now. and possibly make sure I set the user back to null after.

$userId = $this->session->getUser()->getId();

// If csv-queue feature is enabled
if (Features::isEnabled('csv-queue')) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willdoran I completely forgot I added this. Basically this means without the feature flag this should be functionally the same as it is now: the CSV import runs synchronously in the HTTP request.


// ... persist the new entity
try {
$id = $this->postRepo->create($entity);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future I think we could collect these all into a array and send it to createMany and it should just work. We need to do it in batches to avoid keeping 10k posts in memory, but it should still dramatically speed up if you insert 500-1000 posts at a time.

@rowasc rowasc closed this Dec 9, 2019
@rowasc
Copy link
Contributor

rowasc commented Dec 9, 2019

Keeping listed elsewhere until we decide if this can be merged yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants