Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to pull data in three different layers faster #17

Closed
rkang30 opened this issue Oct 24, 2017 · 5 comments
Closed

How to pull data in three different layers faster #17

rkang30 opened this issue Oct 24, 2017 · 5 comments

Comments

@rkang30
Copy link

rkang30 commented Oct 24, 2017

Hi,

I have a list of csrs with associate data that exists in different layers. But it took me over 2mins for about 50 csrs.

The values that I am pulling are below.

csr id, csr name, number of active jobs that are associated with the csr id, total job parts, and planned hours (this is sum of job plannedHours under each job with certain activity code.)

here below is my code to pull these data

	$results = [];
	$count = 0;
	foreach($csrs as $csr_id => $csr_name){

		$results[$count]['csr_id'] = $csr_id;
		$results[$count]['csr_name'] = $csr_name;

		$active_jobs = $this->paceStartSql()
		->filter('@csr', (int) $csr_id)
		->sort('@job', true)
		->find()
        ->filterKeys(function($key){
            return preg_match('/^[0-9]{6,}$/', $key);
        });
		
		$active_dockets = count($active_jobs);
		$results[$count]['active_dockets'] = $active_dockets;

		$parts = 0;
		$hours = 0;
		foreach($active_jobs as $job){
			$jobParts = $job->totalParts;
			$plannedHours = (!is_null($job->jobPlans()->filter('@activityCode', $this->prePressJobPlanCode)->first())) ? $job->jobPlans()->filter('@activityCode', $this->prePressJobPlanCode)->first()->plannedHours : 0;
			$parts += $jobParts;
			$hours += $plannedHours;
		}

		$results[$count]['total_parts'] = $parts;    		
		$results[$count]['booked_hours'] = $hours;  

	  $count++;	
	}

	return $results;   

Is there any faster way of pulling the data?

Thanks in advance,
Ryan

@rkang30
Copy link
Author

rkang30 commented Oct 25, 2017

I have played around with current api system and I was able to reduce it to a little less than 1 minute. This time I only pull first 25 csrs. The time that burns the most is at looping stage for fetching 'plannedHours'. I know that I can get the collection of keys in array. Is there any way that I can pluck the one column value (for my case 'plannedHours') after find()? What I looking for is to fetch only 'plannedHours' values in array and sum those values. This would reduce significant time.

Thanks,
Ryan

	$results = [];
	$count = 0;
	foreach($csrs as $csr_id => $csr_name){

		$results[$count]['csr_id'] = $csr_id;
		$results[$count]['csr_name'] = $csr_name;

		$activeJobIds = $this->paceStartSql()
		->filter('@csr', (int) $csr_id)
		->sort('@job', true)
		->find()
        ->filterKeys(function($key){
            return preg_match('/^[0-9]{6,}$/', $key);
        })->keys();
		
		$active_dockets = count($activeJobIds);
		$results[$count]['active_dockets'] = $active_dockets;

        $parts = 0;
        $hours = 0;

        if(count($activeJobIds) > 0){
            $jobPlans = $pace->jobPlan
            ->filter(function($xpath) use ($activeJobIds){
                $c = 0;
                foreach($activeJobIds as $activeJobId){
                    if($c == 0){
                        $xpath->filter('@job', $activeJobId);
                      $c = 1;  
                    }else{
                        $xpath->orFilter('@job', $activeJobId);
                    }
                }
            })
            ->filter('@activityCode', $this->prePressJobPlanCode)
            ->filter('@plannedHours', '!=', 0)
            ->find();

            foreach($jobPlans as $jobPlan){
                $hours += $jobPlan->plannedHours;
            }
        
            $jobParts = $pace->jobPart
            ->filter(function($xpath) use ($activeJobIds){
                $c = 0;
                foreach($activeJobIds as $activeJobId){
                    if($c == 0){
                        $xpath->filter('@job', $activeJobId);
                      $c = 1;  
                    }else{
                        $xpath->orFilter('@job', $activeJobId);
                    }
                }
            })
            ->find()
            ->count();

            $parts += $jobParts;
        }

		$results[$count]['total_parts'] = $parts;    		
		$results[$count]['booked_hours'] = $hours;  

	  $count++;	
	}

@radoslavius
Copy link

We usually do queries on data over the database, which is much much faster. Especially for inquiries through more than one object.

@robgridley wrote: "I hate the Pace database. Who ever named those columns should be flogged in the street. At least come up with a consistent naming scheme for relationship keys!" - it is big TRUE

But we are using phpstorm that make things much easier:
https://blog.jetbrains.com/phpstorm/2014/11/database-language-injection-configuration/

@rkang30
Copy link
Author

rkang30 commented Oct 25, 2017

Hi there,

I found 'pluck()' in KeyCollection class. But it didn't cut any time on fetching 'plannedHours'... I guess I should close this ticket for now..

@rkang30 rkang30 closed this as completed Oct 25, 2017
@robgridley
Copy link
Owner

It does not really matter what you do in PHP, because your PHP code is not the bottleneck. It is the repeated API read object calls which are slow. It is known as an "N+1 problem". Unfortunately, the API does not allow you to read multiple objects at once or fetch only the properties you are interested in.

I agree with @radoslavius. This is a situation where querying Postgres directly—at least for summing the planned hours—would perform a lot better.

@rkang30
Copy link
Author

rkang30 commented Oct 26, 2017

Thanks for your answer, robgridley.

Could it be possible for you to add additional API feature that allows users to run raw psgresql queries?

e.g. $job = $pace->select('SELECT * FROM jobs WHERE job = ? ', [123]);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants