
Making Hyperswitch Server support for mysql with diesel library #4191

CuriousCorrelation opened this issue Mar 26, 2024 · Discussed in #4179

Originally posted by manojradhakrishnan March 22, 2024

Context

To keep things simpler, we initially chose PSQL for building Hyperswitch. However, the vision was to keep Hyperswitch pluggable with any database technology.

Since we adopted Rust as the programming language, we chose diesel as our database communication library, due to its ORM nature.

Problem

With MySQL holding more than 40% market share in the relational database market, we would prefer to have Hyperswitch compatible with MySQL.

Diesel supports communication with PSQL, MySQL and SQLite - Hyperswitch is currently built to support only PSQL. The challenge is to extend the support to MySQL.

How to get started?

  • Set up Hyperswitch by following these instructions on your local machine and make a test payment. This is also a prerequisite.
  • Configure diesel with MySQL and try running the application.
  • Discover the gaps that need to be addressed in order to successfully run Hyperswitch with MySQL support.

Expected Outcome

Share a proposal on how you can help us implement this in a clean manner. Create the proposal as a GitHub issue and share it under this discussion.

RFC for MySQL as another storage backend

Two ways we could go about this:

  1. Translate db dependent module/functions piece by piece with perhaps some code duplication, finally exporting a pub type MysqlPool = Pool<ConnectionManager<MysqlConnection>>; alongside pub type PgPool = Pool<ConnectionManager<PgConnection>>;.
  2. Abstract database operations as storage services.

Both will leverage diesel and conditional compilation. Either method may take a while to reach production given the idiosyncrasies between Postgres and MySQL, but I have a couple of ideas that could make it a bit easier. There are also ways to build a general framework that would help other future integrations; that could be a separate RFC, but for now this covers both approaches.

Proposed Changes

Aside from the very obvious

diesel = { version = "2.1.0", features = ["postgres", "mysql"] }

we start with a combination of a generic database module, an adaptor/service layer, and the repository pattern.

Compiled features

Using conditional compilation, i.e. something like #[cfg(feature = "mysql")], we can export database definitions like this

#[cfg(feature = "mysql")]
pub mod mysql;

#[cfg(feature = "postgres")]
pub mod postgres;

#[cfg(feature = "sqlite")]
pub mod sqlite;

essentially separating implementation details by modules.

We could then store connection info like so

pub enum ConnectionInfo {
    #[cfg(feature = "postgres")]
    Postgres(PostgresUrl),

    #[cfg(feature = "mysql")]
    Mysql(MysqlUrl),

    #[cfg(feature = "sqlite")]
    Sqlite {
        file_path: String,
        db_name: String,
    },
}

Because these database connection strings are valid URLs, we can sanitize them like this

let url = storage_impl::MysqlUrl::new(url::Url::parse(conn_str)?)?;

to use Rust's type system as a validator, i.e. Postgres(PostgresUrl) instead of a &'a str, although it's not strictly necessary.
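
As a sketch of that newtype idea, here is a minimal std-only version. The real code would wrap `url::Url` as shown above; `MysqlUrl`, `UrlError`, and the scheme check below are illustrative, not existing Hyperswitch types.

```rust
/// Hypothetical newtype: a `MysqlUrl` can only be constructed from a string
/// whose scheme is `mysql`, so downstream code never re-validates it.
#[derive(Debug, PartialEq)]
pub struct MysqlUrl(String);

#[derive(Debug, PartialEq)]
pub enum UrlError {
    WrongScheme,
    Malformed,
}

impl MysqlUrl {
    pub fn new(conn_str: &str) -> Result<Self, UrlError> {
        // A crude stand-in for `url::Url::parse` plus a scheme check.
        match conn_str.split_once("://") {
            Some(("mysql", rest)) if !rest.is_empty() => Ok(Self(conn_str.to_owned())),
            Some(_) => Err(UrlError::WrongScheme),
            None => Err(UrlError::Malformed),
        }
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Any function that accepts a `MysqlUrl` then relies on the scheme having been checked exactly once, at the boundary.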

Of course error propagation will be different

#[cfg(feature = "mysql")]
pub use storage_impl::mysql::MysqlError;

#[cfg(feature = "postgres")]
pub use storage_impl::postgres::PostgresError;

#[cfg(feature = "sqlite")]
pub use storage_impl::sqlite::SqliteError;

which we will encapsulate in say StoreError.
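
A minimal sketch of that encapsulation, with stand-in error types (`MysqlError`, `PostgresError`, and the `query_mysql` helper are illustrative, not the real diesel types): a `From` impl per backend error lets `?` convert them into `StoreError` automatically.

```rust
use std::fmt;

// Stand-ins for the backend-specific errors re-exported under the cfg flags above.
#[derive(Debug)]
pub struct MysqlError(pub String);
#[derive(Debug)]
pub struct PostgresError(pub String);

/// One error type at the storage boundary, so callers never match on backend details.
#[derive(Debug)]
pub enum StoreError {
    Mysql(MysqlError),
    Postgres(PostgresError),
}

impl From<MysqlError> for StoreError {
    fn from(e: MysqlError) -> Self {
        StoreError::Mysql(e)
    }
}

impl From<PostgresError> for StoreError {
    fn from(e: PostgresError) -> Self {
        StoreError::Postgres(e)
    }
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::Mysql(e) => write!(f, "MySQL - {}", e.0),
            StoreError::Postgres(e) => write!(f, "Postgres - {}", e.0),
        }
    }
}

/// Toy query: `?` converts the backend error into `StoreError` via `From`.
fn query_mysql() -> Result<(), StoreError> {
    let res: Result<(), MysqlError> = Err(MysqlError("connection refused".into()));
    res?;
    Ok(())
}
```

In the real codebase the `Display`/`From` boilerplate would come from thiserror's derive rather than being hand-written.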

Depending on how the app is hosted, compiling different binaries for different storage backends should be feasible, if not trivial. It should be possible with Docker; I don't think we'll need full container orchestration.

Then we can define one singular "database" struct to interact with

#[derive(Clone)]
pub struct GenericDatabase {
    pub(crate) inner: Pool<SomeConnectionManager>,
    pub(crate) connection_info: Arc<ConnectionInfo>,
}

it'll basically act as an adaptor (not a service layer yet!).

Not totally sure about the field visibility just yet. I also suspect ConnectionInfo would need to be thread-safe, hence the Arc. Not sure if we'll need a Mutex.
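
On the Mutex question: if ConnectionInfo is immutable after startup, an Arc alone should suffice, since Arc<T> is Send + Sync for Sync data. A reduced sketch (the pool field is elided and the names are illustrative):

```rust
use std::sync::Arc;
use std::thread;

// Simplified stand-in for the proposal's ConnectionInfo.
#[derive(Debug)]
pub enum ConnectionInfo {
    Mysql(String),
}

#[derive(Clone)]
pub struct GenericDatabase {
    // `Arc` alone makes the (immutable) connection info shareable
    // across threads; no `Mutex` is needed unless it is ever mutated.
    pub connection_info: Arc<ConnectionInfo>,
}

/// Demonstrates the connection info crossing a thread boundary without locks.
fn read_on_another_thread(db: &GenericDatabase) -> String {
    let info = Arc::clone(&db.connection_info);
    thread::spawn(move || match &*info {
        ConnectionInfo::Mysql(url) => url.clone(),
    })
    .join()
    .unwrap()
}
```

Cloning GenericDatabase then copies a pointer, not the data, which also previews why the later store.clone() calls are cheap.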

We could then also switch between backends based on connection string in the env like so

   match connection_string {
   #[cfg(feature = "mysql")]
   conn_str if conn_str.starts_with("mysql") => {
       let url = storage_impl::MysqlUrl::new(url::Url::parse(conn_str)?)?;

       let connection_limit = url.connection_limit();
       let pool_timeout = url.pool_timeout();
       let max_connection_lifetime = url.max_connection_lifetime();
       let max_idle_connection_lifetime = url.max_idle_connection_lifetime();

       // Manager could be another abstraction over r2d2/bb8's pool.
       let manager = SomeConnectionManager::Mysql { url };

       // This is fairly standard.
        let mut builder = SomeConnectionBuilder::new(manager)?;

       if let Some(limit) = connection_limit {
           builder.connection_limit(limit);
       }

      ...
      ...
      ...

       Ok(builder)
   }
   #[cfg(feature = "postgres")]
   conn_str if conn_str.starts_with("postgres") => {
       let url = storage_impl::PostgresUrl::new(url::Url::parse(conn_str)?)?;

        // Pretty much same as above.

       Ok(builder)
   }
   ...
   ...
   ...

I think this should be enough as a base framework to start porting.

Although we could go further and have a standard way to plug in whichever storage solution we prefer, as long as it adheres to a certain contract; put simply, dependency injection. Essentially we either use storage solutions as services, or go even more modular with each database operation being its own service.

Storage as service (Pluggable storage framework)

We write a Store trait that'll abstract away db ops to a single request-response gateway (inspired by message queuing):

// Need this to make async fn work in traits.
#[async_trait] 
trait Store<Request> {
    type Response;
    type Error;
    
    async fn call(&mut self, req: Request) -> Result<Self::Response, Self::Error>;
}

Here a 'store' is just a service that takes in a request and produces a response; it doesn't care about the exact implementation details, which is just what we need.

It obviously needs to be generic over request. We'll also type-erase it to fit certain criteria later.
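
To make the request/response shape concrete without pulling in an async runtime, here is a synchronous analog of the trait with a toy in-memory implementation; `InMemoryPayments` and `GetPaymentRequest` are made-up names, not Hyperswitch APIs.

```rust
use std::collections::HashMap;

/// Synchronous analog of the proposed `Store<Request>` trait; the real
/// version would be `async` (via `async_trait`), but the shape is the same.
trait Store<Request> {
    type Response;
    type Error;

    fn call(&mut self, req: Request) -> Result<Self::Response, Self::Error>;
}

/// A toy in-memory backend, standing in for a real database.
struct InMemoryPayments {
    payments: HashMap<u64, String>,
}

/// One request type; each operation gets its own.
struct GetPaymentRequest(u64);

impl Store<GetPaymentRequest> for InMemoryPayments {
    type Response = String;
    type Error = String;

    fn call(&mut self, req: GetPaymentRequest) -> Result<String, String> {
        self.payments
            .get(&req.0)
            .cloned()
            .ok_or_else(|| format!("payment {} not found", req.0))
    }
}
```

Swapping the in-memory map for a MySQL or Postgres pool changes only the impl block; callers keep talking to the trait.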

It's similar to something we already have

storage_impl/src/database/store.rs

#[async_trait::async_trait]
pub trait DatabaseStore: Clone + Send + Sync {
    type Config: Send;

    async fn new(config: Self::Config, test_transaction: bool) -> StorageResult<Self>;

    fn get_master_pool(&self) -> &PgPool;

    fn get_replica_pool(&self) -> &PgPool;
}

but instead of being generic over config and producing concrete PgPool, we make it generic over entire database, or perhaps just one operation.

Instead of exporting

pub type PgPool = bb8::Pool<async_bb8_diesel::ConnectionManager<PgConnection>>;

we export storage specifications, i.e.

type CreatePayment = BoxStore<CreatePaymentRequest, CreatePaymentResponse, StoreError>;
type GetPayment = BoxStore<PaymentId, PaymentModel, StoreError>;
type RefundCount = BoxStore<PaymentId, RefundCountResponse, StoreError>;

where BoxStore is just a Box over Store, because I suspect we will need some type erasure to make it work.
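
The type erasure could look like the following sketch, again using a synchronous analog of `Store` so it stays self-contained; the two backends and the hard-coded counts are made up for illustration.

```rust
/// Synchronous analog of the proposed `Store` trait (the real one is async),
/// just to illustrate the type erasure behind `BoxStore`.
trait Store<Request> {
    type Response;
    type Error;

    fn call(&mut self, req: Request) -> Result<Self::Response, Self::Error>;
}

/// A boxed trait object that fixes the request, response and error types.
type BoxStore<Req, Res, Err> = Box<dyn Store<Req, Response = Res, Error = Err>>;

// Two hypothetical backends erased to the same alias.
struct PgRefundCount;
struct MysqlRefundCount;

impl Store<u64> for PgRefundCount {
    type Response = i64;
    type Error = String;

    fn call(&mut self, _payment_id: u64) -> Result<i64, String> {
        Ok(2) // pretend this ran a Postgres query
    }
}

impl Store<u64> for MysqlRefundCount {
    type Response = i64;
    type Error = String;

    fn call(&mut self, _payment_id: u64) -> Result<i64, String> {
        Ok(2) // pretend this ran a MySQL query
    }
}

/// Callers hold the erased alias and never see the concrete backend.
fn pick_backend(use_mysql: bool) -> BoxStore<u64, i64, String> {
    if use_mysql {
        Box::new(MysqlRefundCount)
    } else {
        Box::new(PgRefundCount)
    }
}
```

This is exactly the property the storage specifications above rely on: the alias pins down request, response, and error, while the backend stays swappable at runtime or compile time.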

StoreError could be something like this

#[derive(Debug, thiserror::Error)]
pub(crate) enum StoreError {
    #[error("SQL - {0}")]
    Sql(#[from] diesel::result::Error),

    #[error("ORM - {0}")]
    R2d2(#[from] diesel::r2d2::PoolError),

    // Other backend-specific errors
}

diesel::result::Error is already fairly encompassing, so I don't think we'll need backend-specific errors aside from the ones arising from those idiosyncrasies I mentioned earlier.

So instead of, for example

// from `crates/routers/src/types/storage/refund.rs`

async fn get_refunds_count(
    conn: &PgPooledConn,
    merchant_id: &str,
    refund_list_details: &api_models::refunds::RefundListRequest,
) -> CustomResult<i64, errors::DatabaseError> { ... }

we can

struct RefundCountService {
    refund_count_store: RefundCount,
    ... : SomeOtherSpec<PaymentId, AResponse>,
}

impl RefundCountService {
    fn get_refunds_count(&self, payment_id: PaymentId) -> ServiceResult<RefundCount> {
    ...
    }
}

which could be built like so

impl PostgresStore for RefundCountService {
    fn postgres_store(store: Postgres) -> Self {
        Self::builder()
            .refund_count_store(BoxStore::new(store.clone()))
            .some_other_spec(BoxStore::new(store.clone()))
            ...
            ...
            ...
            .build()
    }
}

and given to a route like this

pub(crate) fn postgres_routes(store: Postgres) -> Router {
    Router::new()
        .route(
            "/refund_count",
            routing::post(refund_count).with_state(SyncService::new(RefundCountService::postgres_store(store.clone()))),
        )
        ...
        ...
        ...
}

Here SyncService is just

#[derive(Clone)]
pub(crate) struct SyncService<Service>(Arc<Mutex<Service>>);

impl<Service> SyncService<Service> {
    pub(crate) fn new(service: Service) -> Self {
        Self(Arc::new(Mutex::new(service)))
    }
}

to make sure it follows actix's contract so it can be shared between routes. There might be ways to avoid a Mutex on the whole service; instead we could put the Mutex on some minimal state that can be shared with cheaper lock/unlocks. Just speculation.

Also, this is an axum example, but actix should work just the same.

And when building the app, we simply delegate each route to db

let app = Router::new()
    .route("/", get(route::index::index))
    .nest("/api/v1/", route::api::v1::postgres(store.clone()))
    ...
    ...
    ...

again an axum example; it should work with actix too.

Here store.clone() just clones URL configs, so we don't need to worry about the clone impacting performance.

Concluding thoughts

While the process may look involved at first, it essentially boils down to simple dependency injection and separation of concerns at the service layer.

Of course, abstracting away storage via traits at the database level is fairly standard, but the main upshot of abstracting at the level of individual database operations is the ability to have multiple databases serve different operations. Take time series data, for example: while a relational db can be a good choice, a db dedicated to time series storage with better optimizations can be a much more performant one.

A side note

If we don't want async_trait on trait Store<Request> (mainly for performance, to avoid boxing futures), and we can spend some time implementing a state machine and using type_alias_impl_trait, then we could do this instead

trait Store<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;

    fn call(&mut self, req: Request) -> Self::Future;
}