Explore compatibility and integration with spflow #15

Open
Robinlovelace opened this issue Apr 26, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@Robinlovelace
Owner

Robinlovelace commented Apr 26, 2022

Building on #14, how does this package link with the spflow package?

Heads-up @LukeCe: I'm thinking that models fitted with functions in your package could be an input to si_predict(). Does that sound reasonable? Any input is welcome, and it could go both ways: if any code or ideas in here could help your work, e.g. use of the od package that does the OD data processing, let me know.

@LukeCe

LukeCe commented Apr 26, 2022

Hi @Robinlovelace, I am very open to the idea of integrating {si}/{od} and {spflow}.

The main goal of {spflow} is to implement efficient estimators for spatial econometric interaction models.
These estimators account for spatial autocorrelation in gravity models and should be computationally feasible even for large-sample applications.
The geographic aspects are not in the scope of {spflow} and should be handled by other packages.
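
For readers unfamiliar with this model class, the following is a sketch of how a spatial econometric interaction model is commonly written in the literature (the form associated with LeSage and Pace, 2008). The notation is an assumption taken from that literature, not from the {spflow} documentation: y is the vectorized n x n flow matrix, W is a neighborhood matrix of the n sites, X_d and X_o hold destination and origin characteristics, and g is a vector of pair-level distances. Which Kronecker product corresponds to origin versus destination dependence depends on how the flow matrix is vectorized.

```latex
y = \rho_d W_d y + \rho_o W_o y + \rho_w W_w y
    + \alpha \iota + X_d \beta_d + X_o \beta_o + \gamma g + \varepsilon,
\qquad
W_d = I_n \otimes W, \quad W_o = W \otimes I_n, \quad W_w = W \otimes W
```

Setting \rho_d = \rho_o = \rho_w = 0 removes the spatial lags and leaves a conventional (log-linear) gravity model.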

Since you raised the issue of modeling a situation where the origin and destination characteristics are distinct, I would like to point to an article I am working on with Christine Thomas https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2022/wp_tse_1312.pdf.
In it, we develop the matrix form estimation of a spatial econometric interaction model for the case where the set of origins can be distinct from the set of destinations and also for the case where the OD-matrix can be sparse.
I have already implemented much of this work in the {spflow} package and plan to release an update in mid-May.

For in-sample predictions (fitted values), {spflow} already provides several methods, but for out-of-sample predictions, there are still some hurdles to overcome.
In the out-of-sample case, we have to distinguish between "simple predictions" and extrapolations, i.e. predictions for flows that come from new origins or go to new destinations.
For simple predictions, which are related to a change in the explanatory variables, the theory is clear and we should be able to implement them in the near future.
Predictors that allow extrapolation to new sites are on our research agenda, but so far we do not have a clear methodology that could be implemented quickly.
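
The distinction between simple predictions and extrapolations can be illustrated with a minimal sketch. It is written in Python purely for illustration and does not use any {spflow} or {si} function; the name `classify_new_flows` and the site labels are hypothetical. A new OD pair counts as a simple prediction when both its origin and its destination were part of the fitted model, and as an extrapolation otherwise.

```python
from typing import Iterable, List, Set, Tuple


def classify_new_flows(
    known_origins: Set[str],
    known_destinations: Set[str],
    new_pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Label each (origin, destination) pair as 'prediction' or 'extrapolation'."""
    labelled = []
    for origin, destination in new_pairs:
        # Both sites must already be known to the fitted model.
        sites_known = origin in known_origins and destination in known_destinations
        kind = "prediction" if sites_known else "extrapolation"
        labelled.append((origin, destination, kind))
    return labelled


# Hypothetical example: the model was fitted on sites A, B and C.
known_o = {"A", "B", "C"}
known_d = {"A", "B", "C"}
pairs = [("A", "B"), ("A", "D"), ("E", "C")]
result = classify_new_flows(known_o, known_d, pairs)
```

In this example only the pair ("A", "B") is a simple prediction; the other two involve a new destination or a new origin and would need the neighborhood data to be extended.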

An integration with si::si_predict() (I couldn't find od::od_predict()) might look like this:

  • for fitted values: pass the model as an argument and rely on the native predict method.
  • for predictions: we additionally need an on-the-fly conversion of the "new data" argument, which should not be difficult to do.
  • for extrapolations: it's not clear to me yet, since we first need to find the best way to take the new observations into account in the neighborhood data.

In order for {od}'s data structures to be directly usable by {spflow}, they would have to provide the following information:

  • Data on OD-pairs (distances, etc.).
  • Data on origins and destinations (population size, income, etc.).
  • Neighborhood graphs for the origins and destinations.

I don't know if this is something you are considering.
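
As a rough illustration of the three pieces of information listed above, here is a minimal sketch of a container that bundles pair data, site data and neighborhood graphs and checks that they refer to the same sites. It is written in Python for illustration only; the class `ODNetworkData`, its fields and the variable names are hypothetical and do not correspond to the actual data structures of {od} or {spflow}.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ODNetworkData:
    """Hypothetical container for OD-pair data, site data and neighborhood graphs."""

    pair_data: Dict[Tuple[str, str], Dict[str, float]]  # e.g. {("A", "B"): {"distance": 2.5}}
    origin_data: Dict[str, Dict[str, float]]             # e.g. {"A": {"population": 1000.0}}
    destination_data: Dict[str, Dict[str, float]]
    origin_neighbours: Dict[str, List[str]]              # adjacency lists between origins
    destination_neighbours: Dict[str, List[str]]         # adjacency lists between destinations

    def validate(self) -> List[str]:
        """Return a list of consistency problems (empty if none were found)."""
        problems = []
        for origin, destination in self.pair_data:
            if origin not in self.origin_data:
                problems.append(f"unknown origin in pair: {origin}")
            if destination not in self.destination_data:
                problems.append(f"unknown destination in pair: {destination}")
        graphs = (
            ("origin", self.origin_data, self.origin_neighbours),
            ("destination", self.destination_data, self.destination_neighbours),
        )
        for role, sites, graph in graphs:
            for site, neighbours in graph.items():
                for name in [site, *neighbours]:
                    if name not in sites:
                        problems.append(f"unknown {role} in neighbourhood graph: {name}")
        return problems


# Hypothetical two-site example where origins and destinations coincide.
sites = {"A": {"population": 1000.0}, "B": {"population": 500.0}}
graph = {"A": ["B"], "B": ["A"]}
data = ODNetworkData(
    pair_data={("A", "B"): {"distance": 2.5}, ("B", "A"): {"distance": 2.5}},
    origin_data=sites,
    destination_data=sites,
    origin_neighbours=graph,
    destination_neighbours=graph,
)
problems = data.validate()
```

Keeping origin and destination data separate, even when they coincide, leaves room for the case where the two sets of sites differ.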

@Nowosad Nowosad added the enhancement New feature or request label Apr 26, 2022
@Robinlovelace
Owner Author

Hi Lukas, quickfire follow-up: many thanks for your detailed and positive response. It sounds like {si} and {spflow} could work well together, and I look forward to trying to use models generated by your package as an argument in si_predict() or some variant of it. I think these packages could be mutually supportive, with {spflow} outstanding on modelling and {si} having the potential to support geographic data processing.

On that note, I'm planning to show how representing OD datasets as geographic desire lines can support disaggregation and diversification of start and end locations, using the 'jittering' approach outlined in this recently published paper and implemented in the Rust crate odjitter by Dustin Carlino, which has simple R bindings. I mention these additional links because you clearly have plenty of experience modelling OD data, and I am interested in your thoughts on disaggregation and other things building on these (hopefully eventually sturdy) foundations.

@LukeCe

LukeCe commented Apr 28, 2022

Hi Robin, I also think {si} and {spflow} have great potential to complement each other.

The disaggregation + jittering approach presented in your article is a great solution to the problem of representing OD flows in road networks.
Whether such disaggregation can increase the statistical accuracy of interaction models is a question that deserves further consideration.
Since the tools we provide in {spflow} aim at high computational efficiency with large samples in mind, they might help to find an answer.

If you want to test the package, you should know that the current version of {spflow} only allows modeling of "textbook data", where origins are equal to destinations and all potential flows are observed.
I am working on an update that will remove these limitations and plan to make it available in May.
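
To check whether a dataset meets the "textbook data" condition described above, a minimal sketch could look as follows. It is written in Python for illustration and is not part of {spflow}; the function name `is_textbook_od` is hypothetical, and the sketch assumes that "all potential flows" includes intra-zonal flows such as A to A.

```python
from typing import Iterable, Tuple


def is_textbook_od(pairs: Iterable[Tuple[str, str]]) -> bool:
    """True if origins equal destinations and every origin-destination pair is observed."""
    observed = set(pairs)
    origins = {origin for origin, _ in observed}
    destinations = {destination for _, destination in observed}
    if origins != destinations:
        return False
    # Every observed pair lies in origins x destinations, so equal size means completeness.
    return len(observed) == len(origins) * len(destinations)


# Hypothetical examples.
complete = [(o, d) for o in "AB" for d in "AB"]   # all four pairs between A and B
sparse = [("A", "B"), ("B", "A")]                  # intra-zonal flows missing
rectangular = [("A", "X"), ("B", "X")]             # origins differ from destinations

complete_ok = is_textbook_od(complete)
sparse_ok = is_textbook_od(sparse)
rectangular_ok = is_textbook_od(rectangular)
```

Only the first dataset passes; the sparse and rectangular cases are the ones the announced update is meant to cover.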


3 participants