
error in AbsoluteClustering with geographic coordinate system #169

Closed

knaaptime opened this issue May 6, 2021 · 36 comments

@knaaptime (Member) commented May 6, 2021

discovered by @federicoricca

The root of the issue is that while we allow haversine distance in the case that a user's data are stored in geographic coordinates, we still need the unit's area, and we're not accounting for that at the moment.

I think the best way to resolve this is to check for geographic coordinates when the index is initialized, raise an error explaining that the index only works on a projected CRS, and provide a flag for forcing it through in case we've incorrectly identified a geographic CRS. It might actually be worth considering that approach elsewhere too.
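A minimal sketch of that guard, assuming geopandas semantics (the function name and the force flag are hypothetical, not the library's actual API):

import geopandas as gpd

def _validate_crs(gdf: gpd.GeoDataFrame, force: bool = False):
    # Refuse geographic coordinates unless the user overrides the check,
    # e.g. because the CRS was misidentified.
    if gdf.crs is not None and gdf.crs.is_geographic and not force:
        raise ValueError(
            "This index requires a projected CRS; reproject with "
            "gdf.to_crs(...) or pass force=True to bypass this check."
        )

(The email exchange that surfaced this is pasted below for reference.)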

Hi Eli,

Thank you for your quick reply. Yes, absolutely, re-post this as an issue; we can continue the conversation there if necessary. I emailed you mostly because I am just starting to deal with this kind of data, and I thought I was making some trivial beginner's mistake.

Thank you for the hint, I will try this way and let you know. Since I will be measuring spatial indexes for a number of US municipalities (with groupby.apply()), do you think it would make more sense/be more precise to use gdf = gdf.to_crs(gdf.estimate_utm_crs()) on the group elements, so that every municipality has its optimal projection?

Thank you,
Federico

On Tue, 4 May 2021 at 12:17, eli knaap <ek@knaaptime.com> wrote:

Hi Federico,

thanks for getting in touch with this issue (as long as it's ok with you, i plan to re-post it as an issue in the segregation repo for the sake of documentation).

Basically, what you've uncovered is an internal inconsistency in the way the index is computed, and an error in the documentation. All of this comes down to projection issues and the way distance and area are measured on the earth (and handled in the indices). The absolute clustering index requires accurate measurements for both area and distance, which basically means it must operate on a projected gdf. We tried to account for this, but the handling was incomplete.

No matter what, we need the centroid of each polygon in the gdf. If your gdf is in a geographic CRS like 4326, geopandas will raise a warning that the centroid calculation could be off (but that's not really where things are going wrong). To get the distance correct when the data are in a geographic CRS, we let you use haversine distance, which should be accurate between the points. The problem is that after that step, we still need an accurate measurement of each polygon's area, which will be nonsense if the gdf uses geographic coordinates (even if you've chosen haversine distance). The reason you can't reproduce the API example is that the sample code leaves out the necessary reprojection lines (the API example is correct if you reproject first)

Try this: before calculating the statistic, reproject your data into an appropriate UTM system using gdf = gdf.to_crs(gdf.estimate_utm_crs())

That should get you an accurate estimate for your data, but let me know if it doesn't. To fix this, i think it's perfectly acceptable that the AbsoluteClustering index only operate on projected data, so we could either raise an error when a geographic crs is detected, or reproject internally with a warning. I'll make sure the error in the API documentation is updated as part of the refactor.

thanks again for raising this

cheers

E

--
Elijah Knaap, Ph.D. | Associate Director, Center for Geospatial Sciences, University of California, Riverside
knaaptime.com | @knaaptime

On 4 May 2021, at 11:35, Federico Ricca wrote:

Hi Eli,

This is Federico, from a couple of issues in the segregation GitHub repo.
First of all, thank you so much for your reply to my query; I succeeded in applying multiple measures with groupby.apply.

However, I am having some issues with the AbsoluteClustering measure on my own data. So I decided to try and replicate the API example, and I cannot seem to recover the value from the API. The map comes naturally with WGS 84 coordinates, and the command does not recover the API result with either the euclidean or haversine metric (indeed, the command returns a warning about possible imprecision of the centroids). I tried reprojecting to EPSG 3395, and it no longer gives a warning, but the index is still wrong, actually a negative one.

I also tried double-checking with the SpatialProximity index, and here there might be something. The geodataframe is the same, so it comes in WGS 84, and if I don't change anything and run the default SpatialProximity, that is with the euclidean metric, I get the usual warning but the correct result from the API. If instead I reproject the map to EPSG 3395 and run the same command, I get no warning but a wrong result.

Could you give me any suggestions on how to deal with the whole coordinate projection and metric choice? There seems to be some inconsistency between what the API suggests (project into meters and use euclidean) and the results I get from running the example.

Apologies for the very long email; any help would be greatly appreciated.

Thank you,
Federico

@federicoricca

Thank you, Eli!

I was having trouble replicating the API even with gdf = gdf.to_crs(gdf.estimate_utm_crs()), most likely because of some inconsistencies in my environment. I am starting clean with a new environment now, but I am having some trouble making it all work together before I try replicating the API as you suggested and applying that to my data.
Do you have any suggestions regarding the best environment to make segregation/pysal and dependencies work smoothly?

Apologies if my jargon is imprecise, and thanks!

@knaaptime (Member Author)

can you share the specific error you're hitting? the estimate_utm_crs method is new, i believe in geopandas 0.9 (thanks @snowman2!!), so you'll need to make sure you have the most recent version. That method also requires a minimum version of pyproj that i can't recall offhand, but it will tell you if you need to upgrade.

good to keep in mind, though, that the function is just a convenience. Probably the "right" thing to do is set the CRS with a bit more intention by finding an appropriate one to match your study area using a site like epsg.io. (though with that said, your use-case of looping through study areas is exactly what i was thinking of when i requested the function be added to geopandas in the first place :P)
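A sketch of that per-study-area loop, using the index call from this thread (the muni_id column and input file are hypothetical, and the import path follows the 1.x layout seen in the warnings later in this thread):

import geopandas as gpd
from segregation.spatial import AbsoluteClustering

gdf = gpd.read_file("municipalities.shp")  # hypothetical input, one row per tract

# Reproject each municipality to its own local UTM zone before computing
# the statistic, so distances and areas are both measured in metres.
results = {
    muni: AbsoluteClustering(
        group.to_crs(group.estimate_utm_crs()), "nhblk10", "pop10"
    ).statistic
    for muni, group in gdf.groupby("muni_id")
}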

@snowman2 commented May 6, 2021

we still need the unit's area

I would be careful using UTM projection for area: https://en.wikipedia.org/wiki/Mercator_projection

"...the Mercator projection inflates the size of objects away from the equator."

If you want the area, I would recommend using an equal area projection. EPSG:6933 is a potential candidate for an equal area projection you can use globally.
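For example (the file name is hypothetical; areas come back in square metres because EPSG:6933 uses metre units):

import geopandas as gpd

gdf = gpd.read_file("tracts.shp")       # any input CRS
areas_m2 = gdf.to_crs(epsg=6933).area   # planar area in an equal-area projection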

@snowman2 commented May 6, 2021

You might also consider using geodesic area: https://pyproj4.github.io/pyproj/stable/examples.html#geodesic-area
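A small example of the geodesic approach from the pyproj docs (the polygon coordinates are illustrative):

from pyproj import Geod
from shapely.geometry import Polygon

geod = Geod(ellps="WGS84")
# A polygon in geographic (lon, lat) coordinates.
poly = Polygon([(-118.41, 33.94), (-118.40, 33.94), (-118.40, 33.95), (-118.41, 33.95)])
# Returns a signed area in square metres (sign follows ring orientation).
area_m2, perimeter_m = geod.geometry_area_perimeter(poly)
print(abs(area_m2))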

@knaaptime (Member Author)

that's why we tag the experts :) Thanks again.

I would be careful using UTM projection for area: https://en.wikipedia.org/wiki/Mercator_projection

and that's why you should set the CRS "intentionally". With our current code, you'll get a much closer estimate of the segregation statistic using UTM than geographic coords, but it still might not be correct.

I was just looking at the geodesic area from the first link. Looks pretty reasonable to adopt that approach under the hood here. I also think i remember @martinfleis considering an enhancement to use geodesic area under the hood for the geopandas.area method when geographic crs is detected but i think that was just a proposal and still not on the actual roadmap.

Maybe the best thing to do would be to include two arguments:

distance_metric = {'euclidean', 'haversine'} and area_metric = {'geodesic', 'planar'}, with both None by default and set internally using gdf.crs.is_geographic, but allow them to be overridden by the user, with a mismatch check or something
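Something like this, as a sketch of the defaulting logic (not the implemented API):

import warnings

def _resolve_metrics(gdf, distance_metric=None, area_metric=None):
    # Default both metrics from the CRS, but let the user override.
    geographic = gdf.crs is not None and gdf.crs.is_geographic
    if distance_metric is None:
        distance_metric = "haversine" if geographic else "euclidean"
    if area_metric is None:
        area_metric = "geodesic" if geographic else "planar"
    # Mismatch check: planar measurements on geographic coordinates are nonsense.
    if geographic and (distance_metric == "euclidean" or area_metric == "planar"):
        warnings.warn("Planar metric requested on a geographic CRS; results are likely wrong.")
    return distance_metric, area_metric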

@martinfleis (Member)

I also think i remember @martinfleis considering an enhancement to use geodesic area under the hood for the geopandas.area method when geographic crs is detected but i think that was just a proposal and still not on the actual roadmap.

It is super inefficient at the moment compared to pygeos, so it won't happen anytime soon. The best option here is to wrap S2 as sf does, but that would require a pygeos-like interface to S2. Reproject ;).

@federicoricca

Thank you both for the help and suggestions. In my case, I am working at the municipality level, so I don't think the Mercator distortions are much of an issue as long as they affect all areas within a municipality to a similar extent, but I might be wrong on this.

Since I am new to geopandas and pysal/segregation, would you suggest starting clean with a new environment and proceeding with an installation from conda-forge for both? Should I start with geopandas or any other dependency first, and then pysal/segregation?

At the moment, I am working with jupyter in VSCode. I am not sure if the required packages/extensions might create some issues with geopandas and/or pysal/segregation.

@knaaptime (Member Author)

Reproject ;).

done, we'll go with strategy 1 (force a projection) and leave it to the user to ensure that the one they've chosen is reasonable

@federicoricca if you start a new environment, then install segregation from conda-forge, you'll probably be all set, because that should get the latest versions of all packages (you may need to install jupyter in that env as well, since it's not a dependency). VSCode and its extensions won't be a problem, but you do have to make sure that the jupyter kernel you're using in your VSCode notebooks is the one from the environment you've just created.

@federicoricca

Ok, so it took a bit more trouble than expected, but I am up and running now. I believe there are some conflicts between jupyter and the latest geopandas (fiona in particular), so I had to follow a specific order to avoid any errors on import. From a new environment:

conda install -c conda-forge jupyter

and necessarily following that:

conda install -c conda-forge segregation

I believe the same would work with geopandas alone. Also, it does not work to just let the VSCode jupyter extension install the required packages; they have to be installed manually, and first, before geopandas/segregation. I am not sure why.

I will try to replicate the API examples with reprojection, and apply that method to my data. I will let you know how it goes asap. Thanks again, everyone!

Just for reference, I am attaching the .txt of my environment finally working, geo.txt

@federicoricca

@knaaptime, I tried to replicate the API for SpatialProximity and AbsoluteClustering, with mixed results.

For SpatialProximity, it is possible to recover the same statistic as the API by not reprojecting the gdf, which is incorrect and indeed generates warnings.

gdf.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

SpatialProximity(gdf, 'nhblk10', 'pop10').statistic

C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1621: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

c_lons = np.array(data.centroid.x)
C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1622: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

c_lats = np.array(data.centroid.y)
C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1643: UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

np.fill_diagonal(c, val = np.exp(-(alpha * data.area)**(beta)))
1.0021918830065368

Reprojecting gives no warning, but a slightly different result.

gdf_proj = gdf.to_crs(gdf.estimate_utm_crs())
gdf_proj.crs

<Projected CRS: EPSG:32611>
Name: WGS 84 / UTM zone 11N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 120°W and 114°W, northern hemisphere between equator and 84°N, onshore and offshore. Canada - Alberta; British Columbia (BC); Northwest Territories (NWT); Nunavut. Mexico. United States (USA).
- bounds: (-120.0, 0.0, -114.0, 84.0)
Coordinate Operation:
- name: UTM zone 11N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

SpatialProximity(gdf_proj, 'nhblk10', 'pop10').statistic

1.001524478777269

Things get uglier for AbsoluteClustering. Without projecting, the result comes with warnings and is off compared to the API. With the same projection as before, it gives no warning but a negative index. In neither case is it possible to replicate the API results.

AbsoluteClustering(gdf, 'nhblk10', 'pop10').statistic

C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1849: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

c_lons = np.array(data.centroid.x)
C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1850: UserWarning: Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

c_lats = np.array(data.centroid.y)
C:\Users\ricca\anaconda3\envs\geo\lib\site-packages\segregation\spatial\spatial_indexes.py:1871: UserWarning: Geometry is in a geographic CRS. Results from 'area' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.

np.fill_diagonal(c, val = np.exp(-(alpha * data.area)**(beta)))
0.00995047795534899

AbsoluteClustering(gdf_proj, 'nhblk10', 'pop10').statistic

-0.25993772986194746

I will try to replicate the examples notebook in the repo and see if I get similar inconsistencies. Any idea what the issue might be here? Reprojecting does not seem to make it work either; the ACL should not be negative, as far as I understand.

Thanks!

@knaaptime (Member Author) commented May 9, 2021

I could swear I was able to reproduce the API example before i emailed you back, but i can confirm i'm definitely seeing the same results as you now. I also see this show up in the new examples for version 2.0. We'll need to take a closer look at AbsoluteClustering and RelativeClustering

Looks like SpatialProximity is behaving like it should

@renanxcortes (Collaborator) commented May 9, 2021

I remember that in the paper we developed (https://link.springer.com/article/10.1007/s42001-019-00059-3), Table 3 (below) compares the segregation values, mostly against the OasisR package for R (https://cran.r-project.org/web/packages/OasisR/index.html), and I can see that at least Relative Clustering was generating the same values as the R package. (I remember building a geopandas dataframe from heterogeneous Voronoi polygons to make these comparisons as a toy example.)

image

However, if I'm not mistaken, I developed Absolute Clustering after writing that section of the paper, so I didn't make the comparison. It would be cool to run the comparison again and include ACL.

@knaaptime (Member Author)

i think the issue has to do with the way euclidean_distances is formatted. I think i have it resolved by using a pysal W

@knaaptime (Member Author)

@renanxcortes take a look at the new results at the bottom. This makes a small change in the clustering indices to use a libpysal weights object to calculate the distance matrix instead of using scikit

@renanxcortes (Collaborator)

@renanxcortes take a look at the new results at the bottom. This makes a small change in the clustering indices to use a libpysal weights object to calculate the distance matrix instead of using scikit

Got it! Thanks! Also, I think these changes may also affect Distance Decay Isolation/Exposure (which also use scikit's euclidean_distances), right? In those cases, the values match OasisR's implementation, at least.

@renanxcortes (Collaborator)

(Although there is an issue in OasisR that I raised here: cran/OasisR@cc3681d. Its formula doesn't match Massey and Denton's paper, but our implementation follows Massey and Denton's paper exactly.)

@knaaptime (Member Author)

@renanxcortes take another look at that notebook now that the distance calculations all use libpysal.W objects. In particular, dxInteraction and dxIsolation are now essentially equivalent to their aspatial counterparts. With exponential decay on tract-level data, "nearby" observations get discounted so quickly they barely have an effect (the difference is out at like the 8th decimal), though if you use an inappropriate projection like 4326 you can see the function is working like it should. My guess is Massey and Denton used raw geographic coordinates to calculate DPxx in this table, which is why they get a reasonable difference from xPx (mislabelled in the table)
image

@federicoricca

@renanxcortes take a look at the new results at the bottom. This makes a small change in the clustering indices to use a libpysal weights object to calculate the distance matrix instead of using scikit

I was having a look at your modifications to the ACL. If I understand correctly, the ACL is now using a simple inverse-square decay rather than the exponential decay in Massey and Denton?

Also, I was just wondering: if DistanceBand's binary is set to False, does the threshold really matter? I noticed you set it to the max distance in the code.

Thanks!

@knaaptime (Member Author) commented May 10, 2021

ok, i spent some time with both massey/denton and the code today. I think a big part of the confusion comes from their description of the c matrix.

image

they draw an analogy to a contiguity matrix (i.e. where cells describe affinity between i and j), but they actually compute a distance matrix, where cells describe dissimilarity between i and j, so we need to use 1-c to get the proper weights.

I was having a look at your modifications to the ACL. If I understand correctly, the ACL is now using a simple inverse-square decay rather than exponential like in Massey and Denton?

in the most recent commit, i set it back to exp

Also, I was just wondering, if the DistanceBand binary is set to false, does the threshold really matter? I noticed you set it to max distance in the code.

a threshold is required if binary is False
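For instance, with libpysal (a sketch, reusing gdf_proj from the earlier comment):

import numpy as np
from libpysal.weights import DistanceBand
from scipy.spatial.distance import pdist

coords = np.column_stack([gdf_proj.centroid.x, gdf_proj.centroid.y])
# With binary=False the weights are continuous (d**alpha, alpha=-1 by default),
# but a threshold is still required; setting it to the maximum pairwise
# distance keeps every pair of observations inside the band.
w = DistanceBand(coords, threshold=pdist(coords).max(), binary=False)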

With exponential decay on tract-level data, "nearby" observations get discounted so quickly they barely have an effect (difference is out to like the 8th decimal),

this wasn't quite right, there was an error in my code earlier

i'm confident in these new numbers

@federicoricca

With exponential decay on tract-level data, "nearby" observations get discounted so quickly they barely have an effect (difference is out to like the 8th decimal),

I was just playing around with distances and SpatialProximity, and indeed it does look like with exponential decay the discount is just too extreme for NumPy to consider any c_{ij} different from 0. Distances are all given in meters; perhaps using kilometers instead would allow for a more meaningful weighting?
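The underflow is easy to demonstrate (float64 exp goes to exactly zero a little past exp(-745)):

import numpy as np

d_metres = np.array([1_000.0, 5_000.0, 20_000.0])  # typical centroid distances
print(np.exp(-d_metres))            # [0. 0. 0.] -- all underflow to zero
print(np.exp(-d_metres / 1000.0))   # [3.68e-01 6.74e-03 2.06e-09] with km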

@knaaptime (Member Author) commented May 11, 2021

c_{ij} should be a proximity matrix, not a distance matrix as it's described in the paper. OasisR also operationalizes it (incorrectly) as a distance matrix:

> df <- sf::read_sf("~/Dropbox/projects/segregation/notebooks/rside.shp")

> OasisR::RCL(as.data.frame(df)[1:2], folder ="~/Dropbox/projects/segregation/notebooks", shape = "rside")
OGR data source with driver: ESRI Shapefile 
Source: "/Users/knaaptime/Dropbox/projects/segregation/notebooks", layer: "rside"
with 453 features
It has 2 fields
        [,1]   [,2]
[1,]  0.0000 1.1255
[2,] -0.5295 0.0000

> OasisR::ACL(as.data.frame(df)[1:2], folder ="~/Dropbox/projects/segregation/notebooks", shape = "rside")
OGR data source with driver: ESRI Shapefile 
Source: "/Users/knaaptime/Dropbox/projects/segregation/notebooks", layer: "rside"
with 453 features
It has 2 fields
[1] 0.0414 0.6086

> OasisR::xPy(as.data.frame(df)[1:2])
       [,1]   [,2]
[1,] 0.1160 0.8840
[2,] 0.0557 0.9443

> OasisR::DPxy(as.data.frame(df)[1:2], folder ="~/Dropbox/projects/segregation/notebooks", shape = "rside")
OGR data source with driver: ESRI Shapefile 
Source: "/Users/knaaptime/Dropbox/projects/segregation/notebooks", layer: "rside"
with 453 features
It has 2 fields
       [,1]   [,2]
[1,] 0.1160 0.8840
[2,] 0.0557 0.9443
Warning messages:

versus our estimates using the same dataset (using the 2.0 branch)

image

notice OasisR gives a negative estimate for relative clustering, and its estimates for interaction and dxinteraction are the same

@knaaptime (Member Author)

i should also note the performance difference between us and OasisR is extreme :)

@federicoricca

Thanks! Yes, I agree about the interpretation of C as a proximity matrix. However, my concern was mostly about the numerical limitations of exponential decay, which is inherently based on distance in 2.0 as well. The issue is that when coordinates are defined in meters, the distance and area values are simply too large, and np.exp(-d) translates them all into literal zeros.

I carefully re-read that spatial section in Massey and Denton, and there is one mention of miles. I believe measuring at a kilometer scale would give more meaningful weights. I did try the C proximity matrix from your new 2.0 SpatialProximity, and the weights assigned to nearby units are very small, even for contiguous units. I hope this helps!

@knaaptime (Member Author) commented May 12, 2021

agree, the exponential decay is fast (especially when you include all observations in the study area). Above we get a difference between Interaction and DxInteraction... at the third decimal.

an alternative would be to use one of the generalized spatial indices in the new 2.0 framework. Two benefits would be that you can choose exactly how you want to represent space by passing a libpysal.W, and you can see the influence of space for your chosen index, e.g. by looking at the multiscalar profile

@federicoricca

I have been experimenting with the ACL on my own data, and it shows quite inconsistent behavior. I think this has more to do with the ACL itself; the updated code is perfect.

Using the same exponential decay in meters, as Massey and Denton suggest, the discounting is so steep that in a lot of instances the index equals 1. Using hundreds of meters or kilometers, the exponential decay is a bit smoother, but that allows the index to go negative when the X group is highly segregated and a lot of units have x = 0. Have you ever noticed anything similar? I am not entirely sure how they can guarantee that the ACL will belong to [0, 1). They never explicitly discuss how to deal with units with zero group population. Perhaps the ACL is meant to be used considering only units with x > 0? Thank you for any suggestions/comments.

@renanxcortes (Collaborator)

c_{ij} should be a proximity matrix, not a distance matrix as it's described in the paper. OasisR also operationalizes it (incorrectly) as a distance matrix [...] notice OasisR gives a negative estimate for relative clustering, and its estimates for interaction and dxinteraction are the same

Cool! And with the new framework, the user can set any kind of weights, comprising contiguity or exponential distance, etc.

@knaaptime (Member Author)

@federicoricca can you double check using the latest commit in the 2.0 branch? (sorry, i thought i pushed that a couple days ago.) I included a row-standardization line that i think will make a difference for you. Here's a histogram of stats for 20 different MSAs. You can see it does skew toward 0, but i'm not getting any negative values

image
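The row-standardization step amounts to something like this (a toy sketch; c stands in for the proximity matrix):

import numpy as np

rng = np.random.default_rng(0)
d = rng.uniform(0, 10, size=(4, 4))   # toy pairwise distance matrix
c = np.exp(-d)                        # exponential-decay proximity weights
c = c / c.sum(axis=1, keepdims=True)  # row-standardize: each row sums to 1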

@sjsrey (Member) commented May 15, 2021

On a related note, the haversine function requires radians as arguments, and I think the current implementation here (and likely elsewhere) is passing degrees?
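For reference, scikit-learn's haversine helper (one common implementation) expects [lat, lon] in radians; passing degrees silently gives wrong distances:

import numpy as np
from sklearn.metrics.pairwise import haversine_distances

coords_deg = np.array([[33.94, -118.41], [34.05, -118.24]])  # degrees, [lat, lon]
coords_rad = np.radians(coords_deg)
# The result is in radians on the unit sphere; scale by the Earth's radius.
dist_km = haversine_distances(coords_rad) * 6371.0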

@sjsrey (Member) commented May 15, 2021

Following up on @knaaptime's investigation of the c matrix, from the White (1983) paper:

image

Also for the "own" distance d_{i,i} White has:

image

whereas Massey and Denton have $(0.6A)^{0.5}$. Not the same.

@knaaptime (Member Author)

On a related note, the haversine function requires radians as arguments and I think the current implementation here and likely elsewhere is passing degrees?

in 2.0, i've opted to remove the haversine option, because we often need area anyway. Going forward we operate exclusively on planar geoms and we warn the user otherwise.

@federicoricca

@federicoricca can you double check using the latest commit in the 2.0 branch? [...] You can see it does skew toward 0, but i'm not getting any negative values

Sorry about that, my bad: I wasn't using the new version, and there was a typo on my end. Thanks!

@federicoricca

Following up on @knaaptime's investigation of the c matrix, from the White (1983) paper [...] whereas Massey and Denton have $(0.6A)^{0.5}$. Not the same.

Great catch! Also, in the paragraph above that, White (1983) clearly states that distances are in miles, while the CRS will most likely give distances in meters, which yields quite a different exponential decay, provided Massey and Denton follow White in that respect too.

@knaaptime (Member Author)

I think we can mark this resolved as of #161?

@federicoricca

Definitely, and thank you for all the help!

@knaaptime (Member Author)

awesome. thanks for raising the issue that led to all this digging!

resolved by #161
