[BUG] Issues running the batch_to_eopatch pipeline #13

Closed
mlubej opened this issue Feb 15, 2022 · 5 comments
Labels: bug (Something isn't working)

Comments

@mlubej
Contributor

mlubej commented Feb 15, 2022

Question

I have successfully run the batch download pipeline and would like to convert the batch tiles to EOPatches. After fixing #12 locally, I managed to run the batch_to_eopatch pipeline, but the logs show the following exception:

Summary of exceptions

    LoadUserDataTask (LoadUserDataTask-29825b248e7b11ecbc3b-f57730fc0853):
        14 times:

        TypeError: execute() missing 1 required positional argument: 'eopatch'

This is odd, because LoadUserDataTask is the first task in the workflow, so it should not expect an eopatch argument.

Here is my config:

{
  "pipeline": "eogrow.pipelines.batch_to_eopatch.BatchToEOPatchPipeline",
  "folder_key": "data",
  "mapping": [
    {"batch_files": ["B01.tif"], "feature_type": "data", "feature_name": "B01", "multiply_factor": 1e-4},
    {"batch_files": ["B02.tif"], "feature_type": "data", "feature_name": "B02", "multiply_factor": 1e-4},
    {"batch_files": ["B03.tif"], "feature_type": "data", "feature_name": "B03", "multiply_factor": 1e-4},
    {"batch_files": ["B04.tif"], "feature_type": "data", "feature_name": "B04", "multiply_factor": 1e-4},
    {"batch_files": ["B05.tif"], "feature_type": "data", "feature_name": "B05", "multiply_factor": 1e-4},
    {"batch_files": ["B06.tif"], "feature_type": "data", "feature_name": "B06", "multiply_factor": 1e-4},
    {"batch_files": ["B07.tif"], "feature_type": "data", "feature_name": "B07", "multiply_factor": 1e-4},
    {"batch_files": ["B08.tif"], "feature_type": "data", "feature_name": "B08", "multiply_factor": 1e-4},
    {"batch_files": ["B8A.tif"], "feature_type": "data", "feature_name": "B8A", "multiply_factor": 1e-4},
    {"batch_files": ["B09.tif"], "feature_type": "data", "feature_name": "B09", "multiply_factor": 1e-4},
    {"batch_files": ["B10.tif"], "feature_type": "data", "feature_name": "B10", "multiply_factor": 1e-4},
    {"batch_files": ["B11.tif"], "feature_type": "data", "feature_name": "B11", "multiply_factor": 1e-4},
    {"batch_files": ["B12.tif"], "feature_type": "data", "feature_name": "B12", "multiply_factor": 1e-4},
    {"batch_files": ["CLP.tif"], "feature_type": "data", "feature_name": "CLP", "multiply_factor": 0.00392156862745098},
    {"batch_files": ["CLM.tif"], "feature_type": "mask", "feature_name": "CLM"},
    {"batch_files": ["dataMask.tif"], "feature_type": "mask", "feature_name": "dataMask"}
  ],
  "userdata_feature_name": "BATCH_INFO",
  "userdata_timestamp_reader": "eogrow.utils.batch.read_timestamps_from_orbits",
  "**global_settings": "${config_path}/sentinel2_l1c_batch_config.json"
}
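As a side note on the multiply_factor values in this config, here is a minimal sketch of what they appear to do. It assumes the pipeline simply multiplies each raw pixel value by the configured factor, which this issue does not confirm. The `rescale` helper and the sample values are hypothetical and are not part of eo-grow.

```python
import math

# Factors copied from the config above.
BAND_FACTOR = 1e-4
CLP_FACTOR = 0.00392156862745098


def rescale(raw_values, factor):
    """Hypothetical helper: multiply every raw pixel value by the factor."""
    return [value * factor for value in raw_values]


# The CLP factor is numerically 1/255, which would map 8-bit values onto [0, 1].
clp_is_one_over_255 = math.isclose(CLP_FACTOR, 1 / 255)

# Made-up sample values, only to show the scaling.
band_sample = rescale([0, 2500, 10000], BAND_FACTOR)
clp_sample = rescale([0, 51, 255], CLP_FACTOR)

print(clp_is_one_over_255, band_sample, clp_sample)
```

Under that assumption, a raw band value of 10000 and a raw CLP value of 255 would both end up at about 1.0.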

Let me know if you need to see what sentinel2_l1c_batch_config.json looks like.

mlubej added the "help wanted" and "question" labels Feb 15, 2022
@mlubej
Contributor Author

mlubej commented Feb 15, 2022

The data is there:

[image attachment]

@zigaLuksic
Collaborator

Ah, the eopatch is Optional[EOPatch] but apparently we forgot to add a default value.
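
To illustrate the point, here is a minimal sketch, not the actual eo-grow code. `Optional[...]` in a type hint only says that `None` is an accepted value; the argument still has to be passed unless the parameter also has a default. The `EOPatch` class and the two task classes below are hypothetical stand-ins.

```python
from typing import Optional


class EOPatch:
    """Stand-in for eolearn.core.EOPatch, used only for illustration."""


class TaskWithoutDefault:
    # Optional[...] allows None as a value, but the argument is still required.
    def execute(self, eopatch: Optional[EOPatch]) -> str:
        return "ran"


class TaskWithDefault:
    # Adding "= None" is what makes the argument optional at call time.
    def execute(self, eopatch: Optional[EOPatch] = None) -> str:
        return "ran"


try:
    TaskWithoutDefault().execute()
except TypeError as error:
    message = str(error)
else:
    message = ""

result = TaskWithDefault().execute()

print(message)
print(result)
```

Calling the first task with no arguments raises the same kind of TypeError as in the log summary, while the second task runs.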

mlubej added the "bug" label and removed the "help wanted" and "question" labels Feb 15, 2022
mlubej changed the title from "[HELP] Issues running the batch_to_eopatch pipeline" to "[BUG] Issues running the batch_to_eopatch pipeline" Feb 15, 2022
@mlubej
Contributor Author

mlubej commented Feb 15, 2022

Thanks for the hint. I tried setting the default to None and got a new error:

❯ eogrow 01_batch_to_eopatch.json
INFO eogrow.core.pipeline:216: Running BatchToEOPatchPipeline
INFO eogrow.core.area.base:176: Loading grid from cache/grid_test_area_BatchAreaManager_0.2_0.004_1_10.0_0.gpkg
INFO eogrow.core.pipeline:159: Searching for Ray cluster
INFO eogrow.core.pipeline:164: No cluster found, pipeline will not use Ray.
INFO eogrow.core.pipeline:174: Starting EOExecutor for 14 EOPatches
  0%|                                                                                                             | 0/14 [00:00<?, ?it/s]
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
Warning 1: TIFFReadDirectory:Sum of Photometric type-related color channels and ExtraSamples doesn't match SamplesPerPixel. Defining non-color channels as ExtraSamples.
  0%|                                                                                                             | 0/14 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/Users/mlubej/.pyenv/versions/surs/bin/eogrow", line 33, in <module>
    sys.exit(load_entry_point('eo-grow', 'console_scripts', 'eogrow')())
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/cli.py", line 80, in main
    pipeline.run()
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 220, in run
    finished, failed = self.run_procedure()
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 263, in run_procedure
    finished, failed, _ = self.run_execution(workflow, exec_args)
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/pipeline.py", line 185, in run_execution
    execution_results = executor.run(**executor_run_params)
  File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 187, in run
    full_execution_results = self._run_execution(processing_args, workers, processing_type)
  File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 219, in _run_execution
    return submit_and_monitor_execution(process_executor, self._execute_workflow, processing_args)
  File "/Users/mlubej/work/projects/sh-project/eo-learn/core/eolearn/core/eoexecution.py", line 398, in submit_and_monitor_execution
    results[future_order[future]] = future.result()
  File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/logging/__init__.py", line 2123, in shutdown
    h.close()
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/core/logging.py", line 253, in close
    self.local_file.close()
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/utils/fs.py", line 90, in close
    self.copy_to_remote()
  File "/Users/mlubej/work/projects/sh-project/eo-grow/eogrow/utils/fs.py", line 103, in copy_to_remote
    fs.copy.copy_file(self._filesystem, self._local_path, self._remote_filesystem, self._remote_path)
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 142, in copy_file
    copy_file_if(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 221, in copy_file_if
    copy_file_internal(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 277, in copy_file_internal
    _copy_locked()
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs/copy.py", line 270, in _copy_locked
    dst_fs.upload(dst_path, read_file)
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/fs_s3fs/_s3fs.py", line 774, in upload
    self.client.upload_fileobj(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/boto3/s3/inject.py", line 537, in upload_fileobj
    future = manager.upload(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/manager.py", line 329, in upload
    return self._submit_transfer(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/manager.py", line 524, in _submit_transfer
    self._submission_executor.submit(
  File "/Users/mlubej/.pyenv/versions/3.8.7/envs/surs/lib/python3.8/site-packages/s3transfer/futures.py", line 474, in submit
    future = ExecutorFuture(self._executor.submit(task))
  File "/Users/mlubej/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/thread.py", line 181, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown

@zigaLuksic
Collaborator

We discovered that the issue is not in multithreading but in reading TIFFs with ImportFromTiffTask. We are investigating further.

@mlubej
Contributor Author

mlubej commented Feb 17, 2022

Fixed in #15

mlubej closed this as completed Feb 17, 2022