Underlying theory/covariance or correlation PCA/EOF #19

Murk89 · 2022-07-11T14:10:17Z

Hi Niclas,

After extensive reading on PCA/EOF, I was wondering whether the multivariate xeofs example here (https://xeofs.readthedocs.io/en/latest/auto_examples/1uni/plot_multivariate-eof.html#sphx-glr-auto-examples-1uni-plot-multivariate-eof-py) uses the covariance or the correlation matrix.
I want to run a multivariate EOF analysis for three variables at each grid box of WRF output, and my supervisor has recommended a correlation-based PCA since the variables are different. I understand that your example uses subsets of the same variable, but is it suitable to replace these subsets with different variables?
Many thanks.

nicrie · 2022-07-11T14:56:02Z

The example shown uses the covariance matrix. However, you're free to choose: what you're probably looking for is the norm argument, which normalizes each feature by its standard deviation, i.e. the analysis is then based on the correlation matrix.

In general, there's no problem with using multivariate PCA on different climate variables. I agree that in this case you most often want the correlation matrix instead of the covariance matrix. So, taking the above-mentioned example, just use norm=True for your case.
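To illustrate why normalizing by the standard deviation amounts to a correlation-based analysis, here is a minimal NumPy sketch. It does not use xeofs itself; the three synthetic columns are hypothetical stand-ins for variables with very different scales (say temperature, rain and snow), and all names are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 500

# Hypothetical stand-ins for three variables on very different scales
base = rng.normal(size=n_samples)
X = np.column_stack([
    280.0 + 10.0 * base + rng.normal(size=n_samples),  # large variance
    0.5 * base + 0.1 * rng.normal(size=n_samples),     # small variance
    1e-3 * rng.normal(size=n_samples),                 # tiny variance
])

# Covariance-based PCA: features are only centred
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# Correlation-based PCA: features are centred and divided by their std
X_standardized = X_centered / X.std(axis=0, ddof=1)
cov_of_standardized = np.cov(X_standardized, rowvar=False)
corr = np.corrcoef(X, rowvar=False)

# The covariance of standardized data is the correlation matrix
same = np.allclose(cov_of_standardized, corr)

# Fraction of variance per mode, sorted from largest to smallest
evals_cov = np.linalg.eigvalsh(cov)[::-1]
evals_corr = np.linalg.eigvalsh(corr)[::-1]
frac_cov = evals_cov / evals_cov.sum()
frac_corr = evals_corr / evals_corr.sum()
```

With the covariance matrix the leading mode is dominated by the variable with the largest variance, whereas after standardization each variable contributes with unit variance.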

Hope it helps!

Murk89 · 2022-07-11T18:48:37Z

Hi Niclas,

Thanks for getting back.
I have run xeofs with multiple variables and it seems to be working, though I am slightly puzzled as to how to save the analysis.
I tried saving it the same way as in xmca, using mpca.save_analysis('my_analysis'), and got this error: AttributeError: 'EOF' object has no attribute 'save_analysis'

nicrie · 2022-07-14T20:27:01Z

Sorry for getting back to you so late. There's currently no method to save a model automatically (although it wouldn't be too difficult to implement).
For the moment, you have to save the individual fields yourself. For example, assuming you ran a multivariate EOF analysis on two DataArrays, yielding eofs1 and eofs2, you could do:

eofs1, eofs2 = pca.eofs()
pcs = pca.pcs()

eofs1.to_netcdf('eofs1.nc')
eofs2.to_netcdf('eofs2.nc')
pcs.to_netcdf('pcs.nc')
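Reading a saved field back in later can be sketched as follows. This is only an illustration: the random DataArray is a hypothetical stand-in for an EOF field returned by the model, a temporary directory is used so that nothing is left on disk, and it assumes a netCDF backend (e.g. netCDF4 or scipy) is installed alongside xarray.

```python
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr

# Hypothetical stand-in for one EOF field; in practice it comes from the model
eofs1 = xr.DataArray(
    np.random.default_rng(0).normal(size=(4, 5, 3)),
    dims=("lat", "lon", "mode"),
    coords={"lat": np.arange(4.0), "lon": np.arange(5.0), "mode": [1, 2, 3]},
    name="eofs1",
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "eofs1.nc"
    eofs1.to_netcdf(path)
    # load() pulls the values into memory so the file can be closed safely
    with xr.open_dataarray(path) as f:
        eofs1_reloaded = f.load()

# True if values, dimensions and coordinates survived the round trip
roundtrip_ok = eofs1_reloaded.equals(eofs1)
```

The same pattern applies to the PCs; several fields could also be collected in one xr.Dataset and written to a single file.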

Murk89 · 2022-07-19T11:30:19Z

Hi Niclas.

So I have run into a new error now.
ValueError: Standard deviation of one ore more features is zero, normalization not possible.

My understanding is that some grid points in my WRF temperature, rain and/or snow arrays are zero at every time step, which triggers this error. Do you have any suggestions for dealing with this?
Many thanks.

nicrie · 2022-07-19T12:54:34Z

Try removing the grid points which have zero variance, e.g. for a given DataArray da:

# to check whether the variance is zero, compare against a small number
epsilon = 1e-5
# define the names of your spatial dimensions
spatial_dims = ('lat', 'lon')

# boolean mask: True where the variance over time is non-negligible
valid_gridpoints = da.var('time') > epsilon
# stack the spatial dimensions, keep only valid grid points, then unstack again
da_clean = da.stack(x=spatial_dims).sel(x=valid_gridpoints.stack(x=spatial_dims)).unstack()

Note: it's better to keep different issues separate. Feel free to open a new issue for each new bug/error that you encounter. It helps other people with similar problems find the solution. :)

Another note: has your initial problem in this thread been solved?

Murk89 · 2022-07-19T13:21:14Z

Hi Niclas,
Thanks for getting back. About the original issue, can't say as yet because the different errors/questions raised here are all part of the analysis.

I am also trying to understand the significance of the output. Starting a new issue for it.

nicrie · 2022-08-23T08:43:24Z

@Murk89 this is just to let you know that the new release, version 0.6.0, includes a bootstrapping class that automatically identifies the number of significant modes and provides confidence intervals for your EOFs and PCs. You can find an example here: https://xeofs.readthedocs.io/en/latest/auto_examples/1eof/plot_bootstrap.html#sphx-glr-auto-examples-1eof-plot-bootstrap-py

Murk89 · 2022-08-23T09:17:56Z

Hi Niclas, Many thanks for the update. In the end, a PCA turned out to be slightly too complicated given the time left in my PhD, so I couldn't update this issue on how it helped with my WRF output analysis.

-- Warm regards, Murk PhD fellow in Arctic Climate studies, Grantham Centre for Sustainable Futures, University of Sheffield, U.K.

nicrie · 2022-08-23T23:02:20Z

No worries & good luck finishing your thesis!

This was referenced Aug 20, 2022

add bootstrapping for significance analysis #21

Closed

LinAlg Error during Rotation for zero communalities #20

Closed

nicrie added the question label Aug 20, 2022

nicrie closed this as completed Aug 29, 2022
