quant_api.py throws away work #303

dabreegster · 2022-01-04T10:01:32Z

I just got a many orders-of-magnitude speedup of the QUANT table lookups by removing some code that has no effect:
dabreegster@62798a5

Two questions...

What is the URN lookup at

RAMP-UA/microsim/quant_api.py

Line 97 in 04b7473

row2 = dfPrimaryZones.loc[dfPrimaryZones['zonei'] == j] #yes, zonei==j is correct, they're always called 'zonei'

supposed to do? Was this for debugging? We assign to id, but then never do anything with it, so it has no effect, except to massively slow down this routine or to throw a KeyError if something's missing (but I've never seen this happen)
Why are there several copies of the same logic? The comments and naming are slightly different, but the method has already been refactored to take in different tables previously.

Thanks, and please forgive all the beginner questions!

The text was updated successfully, but these errors were encountered:

nickmalleson · 2022-01-04T10:33:35Z

Hi Richard, cc Dustin (via github), Happy new year! I hope this finds you well. Dustin has recently started working on the RAMP project and is pouring over the existing code. Richard, who works at CASA, UCL, wrote the quant parts of the code. Dustin has found a couple of possible speed improvements to the quant code and has some questions. Richard please could you have a look at his questions below? The links to github should work, and you’re very welcome to comment on there directly. I think I will need to add you to the repository for you to comment though; if so please send me your github username and I’ll add you. Dustin’s questions: ********** I just got a many orders-of-magnitude speedup of the QUANT table lookups by removing some code that has no effect: ***@***.***<dabreegster@62798a5> Two questions... 1. What is the URN lookup at https://github.com/Urban-Analytics/RAMP-UA/blob/04b7473aed441080ee10b6f68eb8b9135dac6879/microsim/quant_api.py#L97 supposed to do? Was this for debugging? We assign to id, but then never do anything with it, so it has no effect, except to massively slow down this routine or to throw a KeyError if something's missing (but I've never seen this happen) 2. Why are there several copies of the same logic? The comments and naming are slightly different, but the method has already been refactored to take in different tables previously. Thanks, and please forgive all the beginner questions! ********** All the best, Nick

…

-- Nick Malleson Professor of Spatial Science School of Geography, University of Leeds, LS2 9JT ***@***.******@***.***> http://nickmalleson.co.uk/ 0113 34 35248 On 4 Jan 2022, at 10:01, Dustin Carlino ***@***.******@***.***>> wrote: I just got a many orders-of-magnitude speedup of the QUANT table lookups by removing some code that has no effect: ***@***.***<dabreegster@62798a5> Two questions... 1. What is the URN lookup at https://github.com/Urban-Analytics/RAMP-UA/blob/04b7473aed441080ee10b6f68eb8b9135dac6879/microsim/quant_api.py#L97 supposed to do? Was this for debugging? We assign to id, but then never do anything with it, so it has no effect, except to massively slow down this routine or to throw a KeyError if something's missing (but I've never seen this happen) 2. Why are there several copies of the same logic? The comments and naming are slightly different, but the method has already been refactored to take in different tables previously. Thanks, and please forgive all the beginner questions! — Reply to this email directly, view it on GitHub<#303>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABQLWM4IPQY2EDW22M55JDTUULAQPANCNFSM5LHBX5VQ>. Triage notifications on the go with GitHub Mobile for iOS<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

nickmalleson · 2022-01-04T11:00:36Z

Hi Nick and Dustin, The URN is the unique identifier for the school. You need it to know which destination zone (i.e. school point) you’re looking at with the probability of visit from the origin zone (zone i) to the school. The URN is guaranteed to be present, so the lookup can’t fail. As you say, it’s throwing away the URN after looking it up. This isn’t my original QUANT code. It’s a version that’s been refactored into the RAMP project and changed. It was never clear to me whether this code was actually being used in the model? All the QUANT part ever did was to generate trip probabilities from MSOA/IZ model zones to retail points, schools and hospitals via the road network. This gets used in the RAMP model to propagate the infection, but I don’t think the QUANT part ever gets used beyond the model initialisation stage? In other words, you’re not going to speed up the model by speeding up the QUANT probability lookup? I had only intended the original Python code that this is based on to be an example of how you would use the QUANT probability data. It was just there to show what format the data was in. One final question, but where can I find the code that Improbable wrote? If it’s 1000 times faster than the original Python/R project, then I’d start with that. Thanks, Richard. From: Nicolas Malleson ***@***.***> Sent: 04 January 2022 10:33 To: Milton, Richard ***@***.***> Cc: Urban-Analytics/RAMP-UA ***@***.***>; Subscribed ***@***.***>; Urban-Analytics/RAMP-UA ***@***.***> Subject: Re: [Urban-Analytics/RAMP-UA] quant_api.py throws away work (Issue #303) ⚠ Caution: External sender Hi Richard, cc Dustin (via github), Happy new year! I hope this finds you well. Dustin has recently started working on the RAMP project and is pouring over the existing code. Richard, who works at CASA, UCL, wrote the quant parts of the code. Dustin has found a couple of possible speed improvements to the quant code and has some questions. Richard please could you have a look at his questions below? The links to github should work, and you’re very welcome to comment on there directly. I think I will need to add you to the repository for you to comment though; if so please send me your github username and I’ll add you. Dustin’s questions: ********** I just got a many orders-of-magnitude speedup of the QUANT table lookups by removing some code that has no effect: ***@***.***<dabreegster@62798a5> Two questions... 1. What is the URN lookup at https://github.com/Urban-Analytics/RAMP-UA/blob/04b7473aed441080ee10b6f68eb8b9135dac6879/microsim/quant_api.py#L97 supposed to do? Was this for debugging? We assign to id, but then never do anything with it, so it has no effect, except to massively slow down this routine or to throw a KeyError if something's missing (but I've never seen this happen) 2. Why are there several copies of the same logic? The comments and naming are slightly different, but the method has already been refactored to take in different tables previously. Thanks, and please forgive all the beginner questions! ********** All the best, Nick

…

-- Nick Malleson Professor of Spatial Science School of Geography, University of Leeds, LS2 9JT ***@***.******@***.***> http://nickmalleson.co.uk/ 0113 34 35248 On 4 Jan 2022, at 10:01, Dustin Carlino ***@***.******@***.***>> wrote: I just got a many orders-of-magnitude speedup of the QUANT table lookups by removing some code that has no effect: ***@***.***<dabreegster@62798a5> Two questions... 1. What is the URN lookup at https://github.com/Urban-Analytics/RAMP-UA/blob/04b7473aed441080ee10b6f68eb8b9135dac6879/microsim/quant_api.py#L97 supposed to do? Was this for debugging? We assign to id, but then never do anything with it, so it has no effect, except to massively slow down this routine or to throw a KeyError if something's missing (but I've never seen this happen) 2. Why are there several copies of the same logic? The comments and naming are slightly different, but the method has already been refactored to take in different tables previously. Thanks, and please forgive all the beginner questions! — Reply to this email directly, view it on GitHub<#303>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABQLWM4IPQY2EDW22M55JDTUULAQPANCNFSM5LHBX5VQ>. Triage notifications on the go with GitHub Mobile for iOS<https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&data=04%7C01%7Crichard.milton%40ucl.ac.uk%7Cf9adf62188a346f4d0c808d9cf6da6cb%7C1faf88fea9984c5b93c9210a11d9a5c2%7C0%7C0%7C637768892123214307%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=YSUy3htADwPewDgAZ6C5SyGsHJRYjrLJkjSo4la00hI%3D&reserved=0> or Android<https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26referrer%3Dutm_campaign%253Dnotification-email%2526utm_medium%253Demail%2526utm_source%253Dgithub&data=04%7C01%7Crichard.milton%40ucl.ac.uk%7Cf9adf62188a346f4d0c808d9cf6da6cb%7C1faf88fea9984c5b93c9210a11d9a5c2%7C0%7C0%7C637768892123214307%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=26aeceWh2e%2Fc0Gube30FM33%2B0iy5ZorBTiSVlKQLxpY%3D&reserved=0>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.******@***.***>>

dabreegster · 2022-01-04T11:07:47Z

Thanks for the fast response! (I assume Richard is subscribed to the conversation here, or being forwarded emails?)

I think the URN is used earlier to set up activity locations, but there's no need to do this mapping in the middle of the QUANT table lookup. I would guess it was put in there originally for debugging, but never taken out, and has been slowing down that code for a while. :)

This gets used in the RAMP model to propagate the infection, but I don’t think the QUANT part ever gets used beyond the model initialisation stage?

That's correct based on my understanding so far. The init stage in Python can take many hours for larger regions, and one of the biggest slowdowns is here. But the speedups won't affect running the model.

One final question, but where can I find the code that Improbable wrote?

Who is Improbable? In case you meant the Rust port that I started, the QUANT code is at https://github.com/dabreegster/rampfs/blob/a5087204f484b21e5ff80246f98e8831bd67a4a5/src/quant.rs#L118.

nickmalleson · 2022-01-05T07:55:30Z

Hi both, Am forwarding this to Richard who isn't on the github repo. One question I can answer: Improbable are a software dev company who wrote the opencl version of the model (initialised the same way as the python version but then delegates processing to opencl). The source is in microsim/opencl/ramp I think (Sorry, on phone so can't link directly). Nick

…

-- Sent from a mobile, sorry about the grammar

________________________________ From: Dustin Carlino ***@***.***> Sent: Tuesday, 4 January 2022, 11:08 To: Urban-Analytics/RAMP-UA Cc: Nicolas Malleson; Comment Subject: Re: [Urban-Analytics/RAMP-UA] quant_api.py throws away work (Issue #303) Thanks for the fast response! (I assume Richard is subscribed to the conversation here, or being forwarded emails?) I think the URN is used earlier to set up activity locations, but there's no need to do this mapping in the middle of the QUANT table lookup. I would guess it was put in there originally for debugging, but never taken out, and has been slowing down that code for a while. :) This gets used in the RAMP model to propagate the infection, but I don’t think the QUANT part ever gets used beyond the model initialisation stage? That's correct based on my understanding so far. The init stage in Python can take many hours for larger regions, and one of the biggest slowdowns is here. But the speedups won't affect running the model. One final question, but where can I find the code that Improbable wrote? Who is Improbable? In case you meant the Rust port that I started, the QUANT code is at https://github.com/dabreegster/rampfs/blob/a5087204f484b21e5ff80246f98e8831bd67a4a5/src/quant.rs#L118. — Reply to this email directly, view it on GitHub<#303 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ABQLWMZWN5FS4QX6CFOMEU3UULII5ANCNFSM5LHBX5VQ>. Triage notifications on the go with GitHub Mobile for iOS<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>. You are receiving this because you commented.Message ID: ***@***.***>

dabreegster · 2022-04-21T08:41:08Z

Cleaning up this old issue. If anybody using the main branch here needs to speed up their QUANT reader, just remove the dead code mentioned above

dabreegster closed this as completed Apr 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

quant_api.py throws away work #303

quant_api.py throws away work #303

dabreegster commented Jan 4, 2022

nickmalleson commented Jan 4, 2022 via email

nickmalleson commented Jan 4, 2022 via email

dabreegster commented Jan 4, 2022

nickmalleson commented Jan 5, 2022 via email

dabreegster commented Apr 21, 2022

quant_api.py throws away work #303

quant_api.py throws away work #303

Comments

dabreegster commented Jan 4, 2022

nickmalleson commented Jan 4, 2022 via email

nickmalleson commented Jan 4, 2022 via email

dabreegster commented Jan 4, 2022

nickmalleson commented Jan 5, 2022 via email

dabreegster commented Apr 21, 2022