Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sentinel2 input example #1

Closed
Fahdben opened this issue Jan 23, 2018 · 11 comments

Comments

@Fahdben
Copy link

commented Jan 23, 2018

Hi thanks for this python package!
Would it be possible to add an example of how to prepare and load the Sentinel2 bands (inputs) into the ndarray format (without using the WMS service).

@AleksMat

This comment has been minimized.

Copy link
Member

commented Jan 23, 2018

Hello,

our sentinelhub Python package also supports obtaining original ESA data stored at AWS. This examples show how to obtain data into numpy arrays. However a disadvantage of this approach is that you can only obtain band data for entire S-2 tiles and different band images will have different spatial resolution.

The other option is to create a free trial account for the Sentinel Hub services. That will provide you with an instance ID which can be then used for running this package or for obtaining data with sentinelhub package via WMS/WCS services.

@Fahdben

This comment has been minimized.

Copy link
Author

commented Jan 23, 2018

Thanks for the reply!

@Fahdben Fahdben closed this Jan 23, 2018

@pksorensen

This comment has been minimized.

Copy link

commented Feb 11, 2018

i wanted to try with a downloaded L1C package from open access hub and created the following script:

from s2cloudless import S2PixelCloudDetector
import numpy as np
import rasterio
from rasterio.warp import reproject, Resampling



with rasterio.open("/data/test/T32VNJ_20170403T104021_B01.jp2") as scl:
    B01=scl.read()
    tmparr = np.empty_like(B01)
    aff = scl.transform
    print(B01.shape)
    
with rasterio.open("/data/test/T32VNJ_20170403T104021_B02.jp2") as scl:
    B02=scl.read()    
    reproject(
        B02, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B02 = tmparr
    print(B02.shape)

with rasterio.open("/data/test/T32VNJ_20170403T104021_B04.jp2") as scl:
    B04=scl.read()
    reproject(
        B04, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B04 = tmparr
    print(B04.shape)
with rasterio.open("/data/test/T32VNJ_20170403T104021_B05.jp2") as scl:
    B05=scl.read()
    reproject(
        B05, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B05 = tmparr
    print(B05.shape)
with rasterio.open("/data/test/T32VNJ_20170403T104021_B08.jp2") as scl:
    B08=scl.read()
    reproject(
        B08, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B08 = tmparr
    print(B08.shape)
    
with rasterio.open("/data/test/T32VNJ_20170403T104021_B8A.jp2") as scl:
    B8A=scl.read()
    reproject(
        B8A, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B8A = tmparr
    print(B8A.shape)
with rasterio.open("/data/test/T32VNJ_20170403T104021_B09.jp2") as scl:
    B09=scl.read()
with rasterio.open("/data/test/T32VNJ_20170403T104021_B10.jp2") as scl:
    B10=scl.read()
with rasterio.open("/data/test/T32VNJ_20170403T104021_B11.jp2") as scl:
    B11=scl.read()
    reproject(
        B11, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B11 = tmparr
    print(B11.shape)
with rasterio.open("/data/test/T32VNJ_20170403T104021_B12.jp2") as scl:
    B12=scl.read()
    reproject(
        B12, tmparr,
        src_transform = scl.transform,
        dst_transform = aff,
        src_crs = scl.crs,
        dst_crs = scl.crs,
        resampling = Resampling.bilinear)
    B12 = tmparr
    print(B12.shape)

print(B12.shape)
bands = np.array([np.dstack((B01[0]/10000.0,B02[0]/10000.0,B04[0]/10000.0,B05[0]/10000.0,B08[0]/10000.0,B8A[0]/10000.0,B09[0]/10000.0,B10[0]/10000.0,B11[0]/10000.0,B12[0]/10000.0))])
print(bands.shape)
cloud_detector = S2PixelCloudDetector(threshold=0.4, average_over=4, dilation_size=2)
cloud_probs = cloud_detector.get_cloud_probability_maps(bands)
mask = cloud_detector.get_cloud_masks(bands).astype(rasterio.uint8)

with rasterio.open("/data/test/clouds.tif", "w",  driver='GTiff',compress="lzw",height=mask.shape[1],width=mask.shape[2],count=1,dtype=rasterio.uint8) as dest:
    dest.write(mask)
with rasterio.open("/data/test/clouds_prob.tif", "w",  driver='GTiff',compress="lzw",height=cloud_probs.shape[1],width=cloud_probs.shape[2],count=1,dtype=cloud_probs.dtype) as dest:
    dest.write(cloud_probs)

I dont get the /10000.0 part , but its needed to make it work.

Is it possible to get a comment on why the values was scaled like this?

@AleksMat

This comment has been minimized.

Copy link
Member

commented Feb 12, 2018

The values you get after dividing with 10000 are called reflectance and have a theoretical meaning. They are supposed to be from interval [0, 1] (which is however not always the case).

ESA multiplies them with 10000 and stores them as integers in order to decrease the file sizes.

@pksorensen

This comment has been minimized.

Copy link

commented Feb 12, 2018

Thanks @AleksMat

Looks like my script is then working.

Do you have any recommendation on what sample size I should run this at? In the example i downsamled everything to 60m. I was not 100% impressed with the results i got. Should i try upsample everything to 10m instead?

Are there other things I should pay attention for?

I read your blog posts and wanted to ask, if you guys get better results later from more learning data. Will you then simply update the python package and, if i used this in an implementation, get the benefits of your work?

The reason i am looking at your work is that I bene using the Sen2cor and I have found it to have to many false positives on show surrounded by clouds. I was hoping to get a better cloud detection algorithm and remove those false positives.

@pksorensen

This comment has been minimized.

Copy link

commented Feb 12, 2018

one small comment to the 10000 also.

How does that relate to the bands data downloaded from L1C packages. thats uint8 between 0 and 255 values. still divide them by 10000 ?

@azupanc

This comment has been minimized.

Copy link

commented Feb 12, 2018

Do you have any recommendation on what sample size I should run this at? In the example I
downsamled everything to 60m. I was not 100% impressed with the results i got. Should i try upsample everything to 10m instead?

We have ran tests at several different resolutions and it looks like that 160m resolution has the best performance/cost (CPU sec. required during inference time) ratio. But this might be problem specific. I would suggest you to run cloud detection over the same scene at different resolutions, down- or upscale the cloud probability maps or masks to some fixed resolution (i.e. 10m) and compare the results. This way you can get better insight into how resolution impacts the final result.

Are there other things I should pay attention for?

There are 3 knobs that you can tune and impact the final cloud mask:

  • threshold: by increasing/decreasing the threshold value the cloud detection efficiency and clear sky fake rate (i.e. land being classified as clouds) go down/up. We find the default value 0.4 to work best in average, but might be different for your scenes (areas of interest).
  • average_over and dilation_size: these two parameters depend on the resolution. At 10m resolution the recommended values are 22 and 11, respectively, and at 160m resolution the recommended values are 2 and 1. These two parameters have impact on size of the buffer region around the clouds.
@azupanc

This comment has been minimized.

Copy link

commented Feb 12, 2018

I was not 100% impressed with the results I got.

Can you send us the details about your area of interest (bounding box and time interval)? Any feedback, being either positive or negative, is very valuable to us. It can help us identify scenes (or geographic areas) with specific land cover where the classification is more often wrong, which can be addressed in future trainings.

@pksorensen

This comment has been minimized.

Copy link

commented Feb 12, 2018

Sure. I will experiment abit with the resolution and when done playing I will post the result. Ids, regions and visual inspections. Possible comparison with sen2cor of the same location.

Expect a few days before I am done playing.

@azupanc

This comment has been minimized.

Copy link

commented Feb 12, 2018

I read your blog posts and wanted to ask, if you guys get better results later from more learning data. Will you then simply update the python package and, if i used this in an implementation, get the benefits of your work?

More training data would definitely help. However, as pointed in the blog post the training sample is not easy to acquire. But as pointed out above, having feedback from users might help us to identify problematic regions, which can be addressed in future trainings.

@Waasil

This comment has been minimized.

Copy link

commented Aug 10, 2018

Hello,

I have used code @pksorensen (pasted above) and I have received different layer mask then I was using WMS Requests.

I have used parameters: threshold=0.4, average_over=4, dilation_size=2
All bands has been resampled to pixel = 60m . Method of resampling: bilinear.

My algorithm produces unnecessary areas of mask. May you have any idea what is wrong?
How do you resample all bands into one size?

Bbox of my area is: [18.04779052734375, 53.875202055794965, 18.865441745827482, 54.084779979324]
Time = ('2018-03-17')

Code from example (WMS Request):
image

My code:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.