Allow multiple columns of time series as input for prediction #77

Open
rsignell-usgs opened this issue Jan 8, 2020 · 7 comments

Comments

@rsignell-usgs
Contributor

rsignell-usgs commented Jan 8, 2020

We have a use case where we want to analyze the tides at each grid cell in a numerical model, so we have lots of time series with the same time base. How hard would it be to allow a matrix (or list of arrays) as input data to the solver?

@efiring
Collaborator

efiring commented Jan 8, 2020

The present code is heavily based on working with a single series at a time, and it is doing all sorts of things that you would not actually want for your application. Rather than trying to work more vectorization into the present code, I think that what is needed is a separate function specifically designed for the case where you specify the set of constituents, generate the pseudo-inverse of the model matrix, and then apply it to the array of time series.
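A minimal sketch of the pseudo-inverse idea described above, assuming a simplified harmonic model (a mean plus one cosine/sine pair per constituent, with no nodal corrections or other UTide-specific terms). The function `harmonic_matrix` and the synthetic data are hypothetical illustrations, not part of utide:

```python
import numpy as np

def harmonic_matrix(t_hours, freqs_cph):
    """Real-valued model matrix: a constant column, then cos and sin columns per constituent."""
    phase = 2.0 * np.pi * np.outer(t_hours, freqs_cph)       # (nt, nc)
    return np.hstack([np.ones((t_hours.size, 1)), np.cos(phase), np.sin(phase)])

# Assumed setup: 30 days of hourly samples, constituents M2 and S2 only.
t = np.arange(0.0, 30 * 24.0)                                # hours
freqs = np.array([1 / 12.4206012, 1 / 12.0])                 # cycles per hour
nc = freqs.size

B = harmonic_matrix(t, freqs)                                # (nt, 1 + 2*nc)
B_pinv = np.linalg.pinv(B)                                   # computed once

# Synthetic "grid": 1000 series on the same time base, each with its own M2 amplitude.
rng = np.random.default_rng(0)
m2_amp = rng.uniform(0.1, 2.0, size=1000)
h = m2_amp * np.cos(2 * np.pi * freqs[0] * t)[:, None]       # (nt, npoints)

# One matrix multiply fits every series at once.
coef = B_pinv @ h                                            # (1 + 2*nc, npoints)
amp = np.hypot(coef[1:1 + nc], coef[1 + nc:])                # (nc, npoints)
```

Because the pseudo-inverse depends only on the time base and the chosen constituents, its cost is paid once regardless of how many grid cells are fitted.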

@rsignell-usgs
Contributor Author

So I could just do a least squares fit to the tidal constituents at all grid cells and then do the astronomical adjustments based on a single time series using utide. Is that what you are thinking @efiring?

@efiring
Collaborator

efiring commented Jan 10, 2020

Not sure what you mean by "astronomical adjustments". What I have in mind is factoring out the calculation of the model matrix, "B", which is roughly lines 229-277 in _solve.py. Then write a completely new main entry function, e.g., "solve_vectorized", that would take the minimum arguments required for the special case where the constituents to use are specified, there are no missing values in the time array and the (U,V) or H array, and the U, V, H arrays can have more than one dimension, one of which is time. Nothing fancy, no confidence intervals. The lstsq function can handle multiple right-hand sides as a 2-D array; I haven't looked closely, but I think it is vectorized at the C level, so it should be fast. An alternative would be to use something like linalg.pinv2 to get the pseudo-inverse (once), and then matrix-multiply.
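As a rough illustration of the multiple right-hand-sides point, the sketch below passes a 2-D array to `numpy.linalg.lstsq` after flattening the non-time dimensions of a gridded field. The helper name `fit_all_cells` and the one-constituent model matrix are assumptions made for the example; this is not the proposed `solve_vectorized`, and it assumes no missing values:

```python
import numpy as np

def fit_all_cells(B, field, time_axis=0):
    """Least-squares fit of model matrix B to every time series in an N-D array."""
    data = np.moveaxis(field, time_axis, 0)          # put time first
    cell_shape = data.shape[1:]
    rhs = data.reshape(data.shape[0], -1)            # (nt, ncells)
    coef, *_ = np.linalg.lstsq(B, rhs, rcond=None)   # one call, many right-hand sides
    return coef.reshape((B.shape[1],) + cell_shape)

# Assumed model: mean plus a single M2 cosine/sine pair, 15 days of hourly data.
t = np.arange(0.0, 15 * 24.0)
omega = 2 * np.pi / 12.4206012                       # rad per hour
B = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])

# Synthetic field with time as the last axis: (ny, nx, nt).
ny, nx = 4, 5
mean = np.arange(ny * nx, dtype=float).reshape(ny, nx)
field = mean[..., None] + 0.5 * np.cos(omega * t)

coef = fit_all_cells(B, field, time_axis=-1)         # (3, ny, nx)
```

For the pseudo-inverse alternative, note that `scipy.linalg.pinv2` has since been removed from SciPy; `numpy.linalg.pinv` or `scipy.linalg.pinv` can fill the same role.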

@DanCodiga

If I understand what you've asked for, Rich, it is something I included in the Matlab UTide functions. It's explained in the tech report. It may not be the solution you need, but I thought I'd mention it.

@rsignell-usgs
Contributor Author

rsignell-usgs commented Mar 9, 2021

Thanks @DanCodiga, would you have any bandwidth to contribute this here? It would be a huge benefit to the modeling community!

@efiring
Collaborator

efiring commented Mar 9, 2021

@rsignell-usgs Please clarify: is what I described above what you are looking for? What do you mean by "astronomical adjustments"?

@DanCodiga

> Thanks @DanCodiga, would you have any bandwidth to contribute this here? It would be a huge benefit to the modeling community!

I do aspire to get back to some tidal analysis work, but unfortunately it's looking like it won't be for at least another few months or longer. I would like to update the Matlab version of UTide, and also to help flesh out the Python version so it has all (or at least most) of the features in the Matlab version (unless that's too disruptive to what the community has built... really appreciate what everybody has done here!).

As to this specific question: I put the option to pass in an array of time series into the Matlab version. However, I also recall that it is basically a wrapper that implements convenience looping rather than a vectorized approach. Something vectorized, like Eric has suggested, would be more powerful but could also be a tricky chunk of effort. Not sure this is helping much, but it's my $0.02 for now.
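For contrast with a vectorized solve, a convenience-looping wrapper of the kind described here might look roughly like the following. This is not the Matlab UTide code: `solve_each` and the stand-in single-series solver `fit_m2` are hypothetical names used only to show the pattern of calling a one-series routine once per column:

```python
import numpy as np

def solve_each(solve_one, t, series):
    """Call a single-series solver on each column of `series`, shape (nt, nseries)."""
    return [solve_one(t, series[:, k]) for k in range(series.shape[1])]

def fit_m2(t, h):
    """Hypothetical stand-in for a single-series solver: mean plus one M2 harmonic."""
    omega = 2 * np.pi / 12.4206012                   # rad per hour
    B = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    coef, *_ = np.linalg.lstsq(B, h, rcond=None)
    return coef

# Three synthetic series on a shared hourly time base, differing only in amplitude.
t = np.arange(0.0, 15 * 24.0)
amps = (0.5, 1.0, 2.0)
series = np.column_stack([a * np.cos(2 * np.pi * t / 12.4206012) for a in amps])

results = solve_each(fit_m2, t, series)              # one coefficient vector per series
```

The loop rebuilds and refactors the model matrix for every series, which is the repeated work a vectorized approach avoids. In exchange, each series can be handled independently, for example with its own gaps or constituent selection.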


3 participants