Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Network.Moran method #595

Merged
merged 17 commits into from Jan 29, 2021
Merged

add Network.Moran method #595

merged 17 commits into from Jan 29, 2021

Conversation

jGaboardi
Copy link
Member

@jGaboardi jGaboardi commented Jan 25, 2021

This PR partially addresses #586 by...

  • adding a Network.Moran method to the code base that was previously only found in the Tutorials;
  • add tests for the Network.Moran method;
  • rerunning notebooks.

@jGaboardi jGaboardi self-assigned this Jan 25, 2021
@jGaboardi
Copy link
Member Author

@slumnitz @knaaptime @jeffcsauer @sjsrey

I am having a tough time diagnosing this failure. Any thoughts/advice/etc. would be super.

@jGaboardi
Copy link
Member Author

y values [1263c8b] (observation counts along arcs) and dict sorting [9f02ea1] don't appear to be the problem. Maybe it's something in numpy or scipy.stats that causing this down from within esda... but then again CI is still passing for esda. I'm at a loss.

@jeffcsauer
Copy link

Could this be a ubuntu specific issue? As you noted, it's passing on macOS and Windows. I can try to spin up a ubuntu vm in a bit and see what's going on

@jGaboardi
Copy link
Member Author

Could this be a ubuntu specific issue? As you noted, it's passing on macOS and Windows. I can try to spin up a ubuntu vm in a bit and see what's going on

Yeah, I would be surprised at this point if it wasn't an Ubuntu specific issue.

@jGaboardi
Copy link
Member Author

got an Ubuntu machine up. here is a quick gist "demonstrating" incorrect result on Ubuntu.

@jeffcsauer
Copy link

jeffcsauer commented Jan 27, 2021

@jGaboardi have not fully diagnosed the issue, but identified at least one place within the Moran.py function where calculations start to diverge. Specifically:

# From moran.py
 def __calc(self, z):
        zl = slag(self.w, z)
        print(zl[0]) # compare values

First few numbers from Windows:

0.05280528052805272
11.471452145214515
0.5528052805280528
91.44042904290428

First few numbers from Ubuntu:

0.05280528052805272
9.136413641364133
-0.19719471947194722
29.333388338833892

True divergence in the function might start earlier, that was just the most crucial calculation that jumped out at me as likely being different (lucky guess?).

Here are my watermarks:

Windows:

Last updated: 2021-01-26T21:54:05.283616-06:00

Python implementation: CPython
Python version       : 3.8.3
IPython version      : 7.16.1

Compiler    : MSC v.1924 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
CPU cores   : 8
Architecture: 64bit

Watermark: 2.1.0

esda     : 2.3.6
libpysal : 4.3.0
numpy    : 1.19.0
scipy    : 1.5.1
spaghetti: 1.5.0

Ubuntu:

Last updated: 2021-01-26T14:56:44.334798-06:00

Python implementation: CPython
Python version       : 3.8.7
IPython version      : 7.19.0

Compiler    : GCC 5.4.0 20160609
OS          : Linux
Release     : 4.4.0-19041-Microsoft
Machine     : x86_64
Processor   : x86_64
CPU cores   : 8
Architecture: 64bit


Watermark: 2.1.0

esda     : 2.3.6
numpy    : 1.19.5
scipy    : 1.6.0
spaghetti: 1.5.6
libpysal : 4.3.5

I'll keep looking into it.

@jGaboardi
Copy link
Member Author

jGaboardi commented Jan 27, 2021

@jeffcsauer This is a nice find and we have something more concrete to focus on. Here is a rehash of our convo on gitter for the sake of openness:

... it looks like it may be happening in spatial_lag.lag_spatial() (aliased as slag) where w.sprase is present, which is itself being built from weights._build_sparse() and uses scipy.sparse.crs_matrix().

@jGaboardi
Copy link
Member Author

jGaboardi commented Jan 27, 2021

Here is a workflow in macOS and ubuntu demonstrating the same difference in the result for esda.Moran.I for the empirical data, but identical results for small, synthetic data.

EDIT
After looking at the y variable (counts per arc) again, the one created on Ubuntu has two different elements than that created on macOS/Windos (see cell 7 in both gists above). When substituting the counts values from macOS on Ubuntu the esda.Moran.I value is correct. This does appear to be sorting or precision issue, but it is most definitely not a bug in esda.

Things to look at:

  • Are the counts (Network.count_per_link) in macOS vs. Ubuntu?
  • Are the w.neighors objects the same segments in macOS vs. Ubuntu?
  • Are observations being snapped to the same segments in macOS vs. Ubuntu?
    • Network.pointpatterns["key"].obs_to_arc

Solved in #599.

@jGaboardi jGaboardi mentioned this pull request Jan 28, 2021
2 tasks
@codecov
Copy link

codecov bot commented Jan 29, 2021

Codecov Report

Merging #595 (a762b94) into main (9c0487b) will increase coverage by 0.0%.
The diff coverage is 100.0%.

Impacted file tree graph

@@          Coverage Diff          @@
##            main    #595   +/-   ##
=====================================
  Coverage   97.6%   97.7%           
=====================================
  Files          3       3           
  Lines       1144    1154   +10     
  Branches     286     288    +2     
=====================================
+ Hits        1117    1127   +10     
  Misses         6       6           
  Partials      21      21           
Impacted Files Coverage Δ
spaghetti/network.py 97.5% <100.0%> (+<0.1%) ⬆️

@jGaboardi jGaboardi merged commit e38fea7 into pysal:main Jan 29, 2021
@jGaboardi jGaboardi deleted the network_moran branch January 29, 2021 15:52
@jGaboardi jGaboardi moved this from In progress to Done in JOSS paper Jan 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
JOSS paper
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants