Predicted .tif file assigns a class to zero #193
@mmann1123 I didn't intentionally close this. Do we need to reopen? |
Please reopen. It even brings additional errors now. |
@willieseun can you paste a code snippet to show us how you created your predictions? |
I think in the latest ml updates I resolved the issue of dropping one of
the prediction classes.
Missing values I'm not sure.
|
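import geowombat as gw
import geopandas as gpd

psearlst = ['B1','B2','B3','B4','B5','B6','B7','B8','B9','B10']
# Raw string so that backslashes in the Windows path (e.g. '\t') are not read as escapes
labels = gpd.read_file(r'Z:\Projects\Project2022\Tif file Oct\trainingsamplesoct.shp')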
|
I hope you get it. It is not rendering it well enough. |
Just to add, because I don't want to open another issue. |
@willieseun I think I am getting close to the issue. Before we make any changes, can you check if the following creates the output that you are hoping for. Note that I used
import geowombat as gw
from geowombat.data import l8_224078_20200518, l8_224078_20200518_polygons
from geowombat.ml import fit, predict, fit_predict
from geowombat.core import ndarray_to_xarray
import geopandas as gpd
from sklearn_xarray.preprocessing import Featurizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
from sklearn.naive_bayes import GaussianNB
import matplotlib.pyplot as plt

le = LabelEncoder()
# psearlst = ['B1','B2','B3','B4','B5','B6','B7','B8','B9','B10']
psearlst = ['blue', 'green', 'red', 'nir', 'swir1', 'swir2']
labels = gpd.read_file(l8_224078_20200518_polygons) #gpd.read_file('Z:\Projects\Project2022\Tif file Oct\trainingsamplesoct.shp')
# Added 1 here
labels['Classvalue'] = le.fit(labels.name).transform(labels.name) + 1
fig, ax = plt.subplots(dpi=200, figsize=(5,5))
# Resampling for faster processing/testing
with gw.config.update(ref_res=500):
    with gw.open(l8_224078_20200518, resampling='bilinear', stack_dim="band", band_names=psearlst) as src:
        pl = Pipeline(
            [
                ('scaler', StandardScaler()),
                ('pca', PCA()),
                ('clf', RandomForestClassifier(n_estimators=100))
            ]
        )
        X, Xy, clf = fit(src, pl, labels, col="name")
        y = predict(src, X, clf)
        # Convert the numpy array to a DataArray and add the 'no data' value
        y = ndarray_to_xarray(
            src,
            y.astype('uint8'),
            band_names=['estimates'],
            row_chunks=64,
            col_chunks=64,
            attrs={
                'crs': src.crs,
                'res': src.res,
                'transform': src.transform,
                'nodatavals': (0,)
            }
        )
        print(y)
        y.gw.imshow(robust=True, ax=ax)
        y.gw.save("wom_RF.tif", overwrite=True)
plt.tight_layout(pad=1) |
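A note on the "Added 1 here" line, for anyone following along: label encoders assign 0-based integer codes, so without the offset one real class would share the value 0 with the 'no data' value. The sketch below illustrates this with plain NumPy; the class names are hypothetical, and `np.unique(..., return_inverse=True)` is used here only as a stand-in that follows the same sorted, 0-based scheme as `LabelEncoder`.

```python
import numpy as np

# Hypothetical class names standing in for the `name` column of the training polygons
names = np.array(["water", "crop", "tree", "crop", "developed"])

# np.unique sorts the class names and returns 0-based integer codes
classes, codes = np.unique(names, return_inverse=True)

# Without an offset, the first class (alphabetically) is encoded as 0,
# which is the same value used for 'no data' in the snippet above
print(codes.min())

# Shifting by one keeps 0 free for 'no data'
shifted = codes + 1
print(dict(zip(classes.tolist(), range(1, len(classes) + 1))))
```

With the offset, every training class maps to a value of 1 or greater, so a 0 in the output can only mean 'no data'.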
Ok, Let me check... |
It is returning a type error traceback. Note: I updated to the latest release |
The save function seems to be working well though. |
Apologies, it should be |
It is still not working properly. |
The resulting tif file is looking bad. |
I think you should try using it with classes more than 10. The shapefile I am using has 12 classes. Maybe that is why. |
@willieseun Can you elaborate on what you mean by bad? Do you mean that it is still not rendering the nodata value properly, that the classified values are not correct, or that the classification does not look accurate? On the latter, if you are using the test data then I would not expect it to look good because the data being used are just test data, which are meant to show the utility of the function but not to produce a good map. The only things we are addressing with this open issue are 1) nodata rendering in the classification and 2) the classified values are correct. Can you confirm that either of these are still not as expected? If you would like us to reproduce any errors with your data then you will need to post a link to your dataset. |
Can you point out the issue? It looks like there are ~10 classes, so is this from your dataset? The pixels also look to be resampled, so did you keep the |
It does look like zeros are being displayed, so if you attempted to set them as your |
No. I changed it to 50. |
Okay, if you've modified anything then we need to see your code snippet in order to reproduce the results that you shared. |
What does 500 mean for clarification, 500m? |
|
Ok |
|
@willieseun below is a code snippet that masks the
import geowombat as gw
from geowombat.data import l8_224078_20200518, l8_224078_20200518_polygons
from geowombat.ml import fit, predict
import geopandas as gpd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

def main():
    predictors = ['blue', 'green', 'red', 'nir', 'swir1', 'swir2']
    labels = gpd.read_file(l8_224078_20200518_polygons)
    fig, ax = plt.subplots(dpi=200, figsize=(5,5))
    # Resampling for faster processing/testing
    with gw.config.update(ref_res=500):
        with gw.open(
            l8_224078_20200518,
            resampling='bilinear',
            stack_dim='band',
            band_names=predictors
        ) as src:
            pl = Pipeline(
                [
                    ('scaler', StandardScaler()),
                    ('pca', PCA()),
                    ('clf', RandomForestClassifier(n_estimators=100))
                ]
            )
            X, Xy, clf = fit(src, pl, labels, col="name")
            y = predict(src, X, clf)
            y = (
                y.astype('uint8')
                # Coerce from numpy array to dask array (for gw.save())
                # you could borrow chunks from src.gw.row_chunks, but the gw.save()
                # method and rasterio require the blocks to be in intervals of 16
                .chunk({'band': -1, 'y': 64, 'x': 64})
                # Assign geo-attributes
                .assign_attrs(**src.attrs)
                # Set the 'no data' attribute
                .gw.assign_nodata_attrs(0)
                # Convert 'no data' values to nans
                .gw.mask_nodata()
            )
            print(y)
            y.gw.imshow(robust=True, ax=ax)
            y.gw.save("wom_RF.tif", overwrite=True)
    plt.tight_layout(pad=1)
    plt.savefig('test.png')


if __name__ == '__main__':
    main() |
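To make the "Convert 'no data' values to nans" step concrete, here is a minimal NumPy sketch of the general idea. It is not geowombat's implementation of `mask_nodata()`; the array and the `NODATA` constant are hypothetical and only illustrate why a masked class map ends up as a floating-point array.

```python
import numpy as np

NODATA = 0  # assumed 'no data' value, matching the snippet above

# Hypothetical uint8 class map: 0 marks pixels outside the valid image area
pred = np.array(
    [[0, 0, 1],
     [2, 3, 0],
     [1, 1, 2]],
    dtype="uint8",
)

# uint8 cannot store NaN, so cast to float before replacing the 'no data' pixels
masked = np.where(pred == NODATA, np.nan, pred.astype("float32"))

# Count how many pixels were masked
n_masked = int(np.isnan(masked).sum())
print(n_masked)
```

Plotting functions generally skip NaN pixels, which is why the masked border no longer shows up as an extra class in the figure.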
Yeah sorry, that's my fault. Something went sideways. I will need some patience. |
Alright. You have all the time. |
Ok this should be resolved soon. |
@willieseun This should be resolved and the new build pushed to conda-forge as well. Please note you need to specify your missing data value (if it's not in the tif profile) by setting
Closing this issue unless I hear otherwise. |
@willieseun Is this resolved? |
I am unable to update to the latest release currently. |
Do you have anaconda installed?
|
No I use pip |
@willieseun I would recommend trying out miniconda and doing an install from conda-forge. Everything should be working now. Sorry that took a while to resolve, but I am going to close this. |
Still not working |
Can you share some data and your script? I can't replicate on this side. |
@willieseun from the original post
Issue #1: predicted classes do not match the input training classes
Can you please describe what is not working? Is it still both of the issues that you raised? For predictions, unless we can replicate the issue with our test data, we need an example of your data if you are able to share somehow. |
Sorry for being so ambiguous. I meant I wasn't able to install with anaconda. |
I might be able to help interpret if you send the error from the install. In your terminal window try:
|
Thanks @mmann1123 and @willieseun, if this remains an issue, please open a new issue with your |
Thanks, I have been able to install the new version |
I tried plotting the resulting .tif file in ArcMap. First, the output contained fewer predicted classes than my training data. Second, I wasn't able to build unique values for the raster, and the black bands around the raster do not disappear when I set the nodata value to display with no colour.
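One way to check the first point outside of ArcMap is to compare the unique values in the predicted array against the training class values. The sketch below assumes the prediction has already been read into a NumPy array (reading the GeoTIFF itself is left out); `missing_classes`, the sample array, and the class range are hypothetical and only illustrate the comparison.

```python
import numpy as np


def missing_classes(predicted, training_values, nodata=0):
    """Return the training class values that never appear in the predicted array."""
    found = np.unique(predicted)
    # Ignore the 'no data' value when listing predicted classes
    found = found[found != nodata]
    return sorted({int(v) for v in training_values} - set(found.tolist()))


# Hypothetical prediction with 0 as 'no data' and training classes 1..5
pred = np.array([[0, 1, 2], [2, 4, 0]], dtype="uint8")
training = range(1, 6)

result = missing_classes(pred, training)
print(result)
```

An empty list would mean every training class appears at least once in the prediction; any values listed are classes the model never assigned.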