alamb (Contributor) commented Mar 4, 2023

Which issue does this PR close?

Part of #1754

Rationale for this change

I am trying to extract the physical_plan code into its own crate, and to do so I need to remove the circular dependencies between core --> datasource --> execution --> datasource.

Table factories are used for planning, not for execution, whereas RuntimeEnv is used for execution (and I am trying to move it into the datafusion_execution crate).

See more details in #1754 (comment)

What changes are included in this PR?

  1. Move table_factories off of RuntimeEnv and directly to SessionState
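
A rough sketch of the ownership change, using simplified stand-in types rather than DataFusion's real definitions (the trait body, field names, and constructor here are assumptions for illustration only). The point is that the factory map sits on the planning-side `SessionState`, so the execution-side `RuntimeEnv` no longer has to name any datasource type:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for the datasource-side trait. Only the name mirrors the PR;
// the real trait has a different signature.
pub trait TableProviderFactory {
    fn name(&self) -> &str;
}

// Execution-side environment: after the move it holds no factory map,
// so it needs no import from the datasource module.
#[derive(Default)]
pub struct RuntimeEnv {
    // other execution resources elided
}

// Planning-side state: owns the factories and shares the runtime.
pub struct SessionState {
    pub runtime_env: Arc<RuntimeEnv>,
    pub table_factories: HashMap<String, Arc<dyn TableProviderFactory>>,
}

impl SessionState {
    // Hypothetical constructor, for illustration only.
    pub fn new(runtime_env: Arc<RuntimeEnv>) -> Self {
        Self {
            runtime_env,
            table_factories: HashMap::new(),
        }
    }
}
```

With this layout the dependency arrow points one way only: the state knows about the runtime, and the runtime knows nothing about tables.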

Are these changes tested?

Covered by existing tests (the test changes illustrate what happened to the API). I believe this will affect Ballista (cc @avantgardnerio) and delta-rs (cc @roeap).

Are there any user-facing changes?

Users who specify custom TableProviderFactory implementations have a slightly different API for registering them.

alamb added the "api change" (Changes the API exposed to users of the crate) label Mar 4, 2023
github-actions bot added the "core" (Core DataFusion crate) label Mar 4, 2023
self.session_id.clone()
}

/// Return the [`TableFactoryProvider`] that is registered for the
alamb (Contributor, Author) commented:

Suggested change
- /// Return the [`TableFactoryProvider`] that is registered for the
+ /// Return the [`TableProviderFactory`] that is registered for the

};
use std::collections::HashMap;

use crate::datasource::datasource::TableProviderFactory;
alamb (Contributor, Author) commented:

The whole point of the PR is to remove the datasource dependency from the runtime_env module (I am trying to move the runtime_env module into a new crate, datafusion-execution, that does not depend on datasource).

I plan to move the object_store registry in a separate PR to keep the reviews smaller.

pub fn new() -> Self {
let mut table_factories: HashMap<String, Arc<dyn TableProviderFactory>> =
HashMap::new();
table_factories.insert("PARQUET".into(), Arc::new(ListingTableFactory::new()));
alamb (Contributor, Author) commented:

This construction now happens as part of creating SessionState.
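
As a minimal sketch of what "as part of creating SessionState" could look like, the stand-in types below seed the default factory in the state's constructor instead of in `RuntimeEnv::new`. The trait body, the `describe` method, and the `new`/`table_factories` methods are invented for illustration; only the `"PARQUET"` entry and the type names come from the diff above.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in trait and factory; names mirror the diff, bodies are invented.
pub trait TableProviderFactory {
    fn describe(&self) -> String;
}

pub struct ListingTableFactory;

impl TableProviderFactory for ListingTableFactory {
    fn describe(&self) -> String {
        "listing table".to_string()
    }
}

pub struct SessionState {
    table_factories: HashMap<String, Arc<dyn TableProviderFactory>>,
}

impl SessionState {
    // Hypothetical constructor: seeds the default that RuntimeEnv::new
    // used to insert. Only the "PARQUET" entry appears in the diff.
    pub fn new() -> Self {
        let mut table_factories: HashMap<String, Arc<dyn TableProviderFactory>> =
            HashMap::new();
        table_factories.insert("PARQUET".into(), Arc::new(ListingTableFactory));
        Self { table_factories }
    }

    // Read-only view of the registered factories.
    pub fn table_factories(&self) -> &HashMap<String, Arc<dyn TableProviderFactory>> {
        &self.table_factories
    }
}
```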

  let env = RuntimeEnv::new(cfg).unwrap();
  let ses = SessionConfig::new();
- let ctx = SessionContext::with_config_rt(ses, Arc::new(env));
+ let mut state = SessionState::with_config_rt(ses, Arc::new(env));
alamb (Contributor, Author) commented:

Here is the new pattern for adding a table provider factory.
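
The pattern can be sketched with stand-in types as follows: build a mutable state first, register the factory on it, then pass the state on. Everything here is a simplified model and not DataFusion's real code. In particular the accessor name `table_factories_mut`, the `MyTableFactory` type, the `"MYFORMAT"` key, and the helper `state_with_custom_factory` are assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-ins for the DataFusion types named in the diff. They only model
// who owns the factory map, not the real behaviour.
pub trait TableProviderFactory {}

#[derive(Default)]
pub struct SessionConfig;

#[derive(Default)]
pub struct RuntimeEnv;

pub struct SessionState {
    _config: SessionConfig,
    _runtime: Arc<RuntimeEnv>,
    table_factories: HashMap<String, Arc<dyn TableProviderFactory>>,
}

impl SessionState {
    pub fn with_config_rt(config: SessionConfig, runtime: Arc<RuntimeEnv>) -> Self {
        Self {
            _config: config,
            _runtime: runtime,
            table_factories: HashMap::new(),
        }
    }

    // Assumed accessor name; check the merged code for the real one.
    pub fn table_factories_mut(
        &mut self,
    ) -> &mut HashMap<String, Arc<dyn TableProviderFactory>> {
        &mut self.table_factories
    }
}

// Hypothetical user-defined factory.
pub struct MyTableFactory;

impl TableProviderFactory for MyTableFactory {}

// Build the state, register the custom factory on it, and return it.
pub fn state_with_custom_factory() -> SessionState {
    let mut state = SessionState::with_config_rt(
        SessionConfig::default(),
        Arc::new(RuntimeEnv::default()),
    );
    state
        .table_factories_mut()
        .insert("MYFORMAT".to_string(), Arc::new(MyTableFactory));
    state
}
```

In the real crate the state would then be wrapped in a session context; the exact constructor for that step is not shown in this conversation.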

alamb changed the title from "Minor: Move TableProviderFactories up out of RuntimeEnv and into Sess…" to "Minor: Move TableProviderFactories up out of RuntimeEnv and into SessionState" Mar 4, 2023
avantgardnerio self-requested a review March 6, 2023 18:56
avantgardnerio (Contributor) left a comment

Seems okay...

alamb (Contributor, Author) commented Mar 7, 2023

Thank you for the review @avantgardnerio

alamb merged commit deeaa56 into apache:main Mar 7, 2023
alamb deleted the alamb/sever_runtime_env_dep branch October 18, 2024 20:46