ImportError: cannot import name 'GeoJSONSource' from 'geobeam.io' #28

Closed
migaelhartzenberg opened this issue Mar 4, 2022 · 3 comments · Fixed by #32

Comments

migaelhartzenberg commented Mar 4, 2022

I am fairly new to Python and Apache Beam. I used shapefile_nfhl.py as an example to create a reader for GeoJSON files, so I imported GeoJSONSource from geobeam.io (as per the documentation). When I run the application, I get the following error: ImportError: cannot import name 'GeoJSONSource' from 'geobeam.io'

Am I missing something? I followed the instructions to install geobeam: pip install geobeam

I have tried this with Python 3.7, 3.9 and 3.10. Versions 3.7 and 3.9 give this error, whereas 3.10 does not work at all: I run into issues while installing rasterio.

I am also running this on macOS Monterey (12.2.1).

Here is my code:

```python
def run(pipeline_args, known_args):
    import apache_beam as beam
    from apache_beam.io.gcp.internal.clients import bigquery as beam_bigquery
    from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions
    from geobeam.io import GeoJSONSource
    from geobeam.fn import format_record, make_valid, filter_invalid

    pipeline_options = PipelineOptions([
        '--experiments', 'use_beam_bq_sink',
    ] + pipeline_args)

    with beam.Pipeline(options=pipeline_options) as p:
        (p
         | beam.io.Read(GeoJSONSource(known_args.gcs_url,
             layer_name=known_args.layer_name))
         | 'MakeValid' >> beam.Map(make_valid)
         | 'FilterInvalid' >> beam.Filter(filter_invalid)
         | 'FormatRecords' >> beam.Map(format_record)
         | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
             beam_bigquery.TableReference(
                 datasetId=known_args.dataset,
                 tableId=known_args.table),
             method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
             write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
             create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER))


if __name__ == '__main__':
    import logging
    import argparse

    logging.getLogger().setLevel(logging.INFO)

    parser = argparse.ArgumentParser()
    parser.add_argument('--gcs_url')
    parser.add_argument('--dataset')
    parser.add_argument('--table')
    parser.add_argument('--layer_name')
    parser.add_argument('--in_epsg', type=int, default=None)
    known_args, pipeline_args = parser.parse_known_args()

    run(pipeline_args, known_args)
```
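
One way to narrow down an ImportError like the one reported here is to ask the installed module directly whether it exposes the name. The sketch below is only an illustration: `check_export` is a hypothetical helper, not part of geobeam, and it is demonstrated on the standard-library `json` module so that it runs without geobeam installed. Substituting `"geobeam.io"` and `"GeoJSONSource"` would inspect the real install.

```python
import importlib


def check_export(module_name, attr_name):
    """Return (found, public_names) for attr_name in the named module.

    Hypothetical helper for illustration; not part of geobeam.
    """
    module = importlib.import_module(module_name)
    # `from m import X` looks X up as an attribute of m first
    # (and only then tries a submodule named X).
    public_names = sorted(n for n in dir(module) if not n.startswith("_"))
    return hasattr(module, attr_name), public_names


if __name__ == "__main__":
    # Standard-library stand-in; replace with "geobeam.io" / "GeoJSONSource".
    found, names = check_export("json", "loads")
    print("found:", found)
    print("public names:", names)
```

If the name is missing from the list of public names, the installed release probably does not export it, which points at the package version rather than at the calling code.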

tjwebb (Member) commented Mar 7, 2022

Thanks for the issue. I'll take a look this week.

I have not tested against Python 3.10 yet.

tjwebb (Member) commented Mar 7, 2022

possible dupe of #19

migaelhartzenberg (Author) commented Mar 8, 2022

> possible dupe of #19

I saw that issue, but since it mentions "...no attribute", I thought this might be a different problem.

UPDATE
When I install from the GitHub repository instead of from PyPI, this error disappears:
pip install git+https://github.com/GoogleCloudPlatform/dataflow-geobeam#egg=geobeam

However, I then get the same error described in #19.
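
Because the behaviour differs between the PyPI install and the GitHub install, it can help to confirm which one is active in the current environment. The following is a minimal sketch, assuming Python 3.8+ (where `importlib.metadata` is in the standard library) and an installer that records direct-URL installs in `direct_url.json` as described in PEP 610. `describe_install` is a hypothetical helper name, not a geobeam or pip API.

```python
import json
from importlib import metadata  # standard library on Python 3.8+


def describe_install(dist_name):
    """Return the installed version and, if recorded, the direct-install URL.

    Returns None when no distribution with that name is installed.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # Installs from a URL or VCS reference may carry a direct_url.json
    # file (PEP 610); installs from an index normally do not.
    raw = dist.read_text("direct_url.json")
    url = json.loads(raw).get("url") if raw else None
    return {"version": dist.version, "direct_url": url}


if __name__ == "__main__":
    print(describe_install("geobeam"))
```

A `direct_url` that is not `None` suggests the package came from a URL such as the `git+https` install above, while `None` alongside a version suggests an index install.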

tjwebb closed this as completed in #32 on Mar 9, 2022
tjwebb added a commit that referenced this issue Mar 9, 2022
fix GeoJSONSource, add example