Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram data output is empty when using a for loop #1972

Closed
Remolkeman opened this issue Apr 16, 2024 · 5 comments
Closed

Histogram data output is empty when using a for loop #1972

Remolkeman opened this issue Apr 16, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@Remolkeman
Copy link

Remolkeman commented Apr 16, 2024

Environment Information

OS | Windows | CPU(s) | 16 | Machine | AMD64
Architecture | 64bit | RAM | 31.3 GiB | Environment | Jupyter
Python 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
geemap | 0.32.0 | ee | 0.1.398 | ipyleaflet | 0.18.2
folium | 0.16.0 | jupyterlab | 4.1.6 | notebook | 7.1.2
ipyevents | 2.0.2 | geopandas | 0.14.3 | localtileserver | 0.10.2

Description

I am trying to extract the ndvi pixel values from the histograms of multiple Sentinel-2 images and output them as a .csv file. Because there are a decent number of images, I created a for loop to iterate per image, providing the ndvi values from the histogram and saving them on csv. The issue is that the for loop outputs empty data(it actually outputs the bins but not the count); however, if it is executed line by line, it consistently outputs data. I used this example as the base of the code: #1032

What I Did

I have commented on the lines related to saving the data so that it is more comfortable to be run. I also wrote the prints to check the bins and count.

When running the loop this happends:

output_ndvi

If you run line by line:

line_by_line

I also tried to add a delay between each petition to the GEE because I thought that caused the empty data, but the result was the same.

# %%
import ee
import geemap.chart as chart
import pandas as pd

# %%
ee.Authenticate()
ee.Initialize()


# %%
def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])

    return ndvi.rename("NDVI")


# %%
geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

# %%
collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

# %%
ndviCollection = collection.map(calculateNDVI)

# %%
# geemap.ee_export_image_collection_to_drive(ndviCollection, folder="GEE", scale=10)

# %%
my_sample = ndviCollection.toBands().sample(geometry, 10)

# %%
options = {
    "title": "NDVI",
    "xlabel": "NDVI",
    "ylabel": "n",
    "colors": ["#1d6b99"],
}


# %%
properties = my_sample.first().propertyNames().getInfo()

# %%
for prop in properties:
    hist_data = chart.feature_histogram(my_sample, prop, **options, show=False)
    
    print(hist_data.bins)
    print(hist_data.count)

    df = pd.DataFrame({"NDVI": hist_data.midpoints, "n": hist_data.count[1:]})

    # df.to_csv(f"{prop}.csv")

# %%
@Remolkeman Remolkeman added the bug Something isn't working label Apr 16, 2024
@giswqs
Copy link
Member

giswqs commented Apr 16, 2024

Have you tried using the zonal_stats function?
https://geemap.org/notebooks/12_zonal_statistics/

@Remolkeman
Copy link
Author

Hi!

Following your suggestion, I tried the zonal_stats function, and the following are the results

No for loop:

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)

geometry_feature = ee.FeatureCollection(geometry)
scale = 10

out_ndvi_stats = os.path.join(os.path.expanduser("~"), "Downloads", "ndvi_stats.csv")
geemap.zonal_stats(ndviCollection, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

geemap.create_download_link(out_ndvi_stats)


In this case, it does output a csv with the mean values properly and not empty.

"
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1/projects/earthengine-legacy/tables/34324237bcb4dd798c95c943ae4842e8-3f0985e16b05496e4c0d4373a875a4c0:getFeatures
Please wait ...
Data downloaded to C:\Users\user\Downloads\ndvi_stats.csv
"

However, when using a for loop, it does not work properly.

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)


geometry_feature = ee.FeatureCollection(geometry)


scale = 10

out_dir = os.path.join(os.path.expanduser("~"), "Downloads")

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

for i, image in enumerate(ndviCollection.toList(ndviCollection.size())):
    image = ee.Image(image)
    out_ndvi_stats = os.path.join(out_dir, f"ndvi_stats_{i}.csv")
    geemap.zonal_stats(image, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

"
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))"
}
"
What I am looking for is to extract the ndvi value of each pixel in each image of the collection inside the designed geometry. As I could check, this function returns mean, std, sum, etc. It seems that something is wrong in the for loop, and I might be failing to properly program it.

@giswqs
Copy link
Member

giswqs commented Apr 23, 2024

Use collection.toBands() instead of a for loop.

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)


geometry_feature = ee.FeatureCollection(geometry)


scale = 10

out_dir = os.path.join(os.path.expanduser("~"), "Downloads")

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

image = ndviCollection.toBands()

out_ndvi_stats = "ndvi.csv"
geemap.zonal_stats(image, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

@giswqs giswqs closed this as completed Apr 23, 2024
@Remolkeman
Copy link
Author

Remolkeman commented Apr 24, 2024

Hi Qiusheng Wu!

It is very likely that I have not properly explained the problem I am facing. To that end, I want to give you some context to help with your understanding:

I am currently doing a final master thesis, in which I am applying common remote sensing techniques such as NDVI, NDMI, SAVI..... on several delimited areas within the Guadalquivir estuary(Spain). Thus far, I have created a time series that represents the mean value of the indices. What I am currently trying to do is to use each satellite image within these time series and represent the value of each pixel for NDVI, NDMI, etc. in a histogram, and thus check the distribution of the data outside a mean. For example, imagine a Sentinel-2 image for the date 25/12/2015. In this case, I want to know the NDVI value of each pixel in that defined area, represent it in a histogram, and check the distribution of this data for that particular day of that year. In the case of my study, I would have approximately 2000+ images between the areas. Therefore, I want to apply a for loop that uses these NDVI pixel values and represent them in a histogram for each of these images and, if possible, save the NDVI(I write NDVI, but it would be for each index) pixel values in a csv. In short, I am not looking for mean values, but the pixel values.

When I tried to apply this loop, I encountered the problem reported above, returning an empty output for using the loop.

@Remolkeman
Copy link
Author

up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants