
Issue search results · repo:AlexIoannides/pyspark-example-project language:Python

18 results

Hi, it seems like etl_config.json is not accessible when running in YARN client mode. Could you please help me investigate this issue?
  • Rustem
  • 1
  • Opened on Dec 15, 2021
  • #30

ERROR: test_transform_data (tests.test_etl_job.SparkETLTests) Test data transformer. Traceback (most recent call last): File "/home/brix/pyspark-workloads/pyspark-example-project/tests/test_etl_job.py" ...
  • marouenes
  • Opened on Aug 5, 2021
  • #28

Please add the LICENSE file to the repo so that one is sure of how to use it in closed-source codebases. See here
  • ajknzhol
  • 1
  • Opened on Mar 31, 2021
  • #27

Could you add functionality to pass job-level parameters? Pass via a parameters file, maybe?
  • sou-joshi
  • Opened on Mar 22, 2021
  • #25

When I run the code with the following command: $ spark-submit --master local[*] jobs/reconciliation.py I get the error ModuleNotFoundError: No module named 'dependencies'. It's because jobs and dependencies ...
  • mohit-manna
  • Opened on Jan 28, 2021
  • #24

File "/home/ashish/Downloads/pyspark-example-project-master/jobs/etl_job.py", line 57, in main data_transformed = transform_data(data, config['steps_per_floor']) TypeError: 'NoneType' object is not subscriptable ...
  • averma111
  • Opened on Oct 3, 2020
  • #23

https://github.com/AlexIoannides/pyspark-example-project/blob/13d6fb2f5fb45135499dbd1bc3f1bdac5b8451db/tests/test_etl_job.py#L64 You should use data_transformed, not expected_data, for the actual transformation ...
  • minhsphuc12
  • 1
  • Opened on Sep 30, 2020
  • #22

First of all, thanks for the great work! I am new to Spark and this repo has really helped me get started. I am trying to get my ETL job running on AWS EMR in cluster mode, but hit an issue ...
  • junjchen
  • 1
  • Opened on Sep 25, 2020
  • #21

If they are not class methods, then the method would be invoked for every test and a session would be created for each of those tests. `class PySparkTest(unittest.TestCase): @classmethod def suppress_py4j_logging(cls): ...`
enhancement
good first issue
  • amrishan
  • 1
  • Opened on Sep 16, 2020
  • #20

When I run `from pyspark import SparkFiles`, I get the error AttributeError: module 'logging' has no attribute 'Handler'. Python version: 3.8.5; Spark version: 3.0.0; pyspark version: 3.0.1. Anyone know how ...
  • kychanbp
  • Opened on Sep 11, 2020
  • #19