Fix SparkSession.getActiveSession() returning None in connectivity metrics #1985

Merged
eKathleenCarter merged 5 commits into main from
ekcarter/bugfix-SparkSession.getActiveSession
Dec 17, 2025

Conversation

@eKathleenCarter
Collaborator

Update connectivity_metrics.py to use ps.SparkSession.builder.getOrCreate() instead of SparkSession.getActiveSession(), which was returning None.
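
For readers unfamiliar with the failure mode, here is a minimal sketch of the lookup pattern, not the actual code in connectivity_metrics.py. In PySpark, `SparkSession.getActiveSession()` returns `None` when no session is active in the current thread, while `SparkSession.builder.getOrCreate()` reuses an existing session or creates one. The helper name `resolve_spark_session` and its `session_cls` parameter are hypothetical; the class is passed in so the logic can be read and exercised without a running JVM.

```python
from typing import Any


def resolve_spark_session(session_cls: Any) -> Any:
    """Return a usable Spark session from a SparkSession-like class.

    Hypothetical helper for illustration: `session_cls` is expected to be
    pyspark.sql.SparkSession, passed in rather than imported here.
    """
    # getActiveSession() yields None when no session is active in this thread,
    # which is the case that broke the connectivity metrics node.
    active = session_cls.getActiveSession()
    if active is not None:
        return active
    # getOrCreate() reuses an existing session or creates a new one,
    # so callers never receive None from this branch.
    return session_cls.builder.getOrCreate()
```

Assuming `ps` is an alias for `pyspark.sql`, usage would look like `spark = resolve_spark_session(ps.SparkSession)`. Because `getOrCreate()` already reuses an existing session, calling `ps.SparkSession.builder.getOrCreate()` directly, as this PR does, should cover both cases.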

Description of the changes

Fixes / Resolves the following issues:

Checklist:

  • Added label to PR (e.g. enhancement or bug)
  • Ensured the PR is named descriptively. FYI: This name is used as part of our changelog & release notes.
  • Looked at the diff on GitHub to make sure no unwanted files have been committed.
  • Made corresponding changes to the documentation
  • Added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules
  • If breaking changes occur or you need everyone to run a command locally after
    pulling in latest main, uncomment the below "Merge Notification" section and
    describe steps necessary for people
  • Ran on sample data using kedro run -e sample -p test_sample (see sample environment guide)

update connectivity_metrics.py to use ps.SparkSession.builder.getOrCreate()
@eKathleenCarter eKathleenCarter self-assigned this Dec 8, 2025
@eKathleenCarter eKathleenCarter marked this pull request as ready for review December 8, 2025 15:06
@eKathleenCarter eKathleenCarter requested a review from a team as a code owner December 8, 2025 15:06
Comment thread pipelines/matrix/src/matrix/pipelines/integration/connectivity_metrics.py Outdated
@eKathleenCarter
Collaborator Author

Updated to graphframes-py>=0.10.0.

Still producing the same results:


[12/15/25 16:23:48] INFO     Found 1,840,933 connected components using GraphFrames. Largest component has 7,367,582 nodes.    connectivity_metrics.py:95
                    INFO     Saving data to integration.prm.connected_components_node_assignments (LazySparkDataset)...    data_catalog.py:445
[12/15/25 16:23:59] INFO     Saving data to integration.prm.connected_components_stats (LazySparkDataset)...    data_catalog.py:445
[12/15/25 16:24:09] INFO     Completed node: compute_connected_components    runner.py:250
                    INFO     Completed 1 out of 1 tasks    runner.py:251
                    INFO     Pipeline execution completed successfully in 369.4 sec.    runner.py:135

Collaborator

@JacquesVergine JacquesVergine left a comment

Thanks Kathleen, great work!

@eKathleenCarter
Collaborator Author

@JacquesVergine, something in CI seems to be hanging. Not sure what is happening here. The checks have not completed.

@eKathleenCarter eKathleenCarter merged commit ff82c0f into main Dec 17, 2025
12 of 14 checks passed
@eKathleenCarter eKathleenCarter deleted the ekcarter/bugfix-SparkSession.getActiveSession branch December 17, 2025 17:58
Dashing-Nelson added a commit that referenced this pull request Dec 18, 2025
…trics (#1985)

* Update connectivity_metrics.py

update connectivity_metrics.py to use ps.SparkSession.builder.getOrCreate()

* upgrading graphframes to graphframes-py

* Update connectivity_metrics.py

---------

Co-authored-by: Nelson Alfonso <45660392+Dashing-Nelson@users.noreply.github.com>
3 participants