Skip to content

FusionDB/fql-training

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
src
 
 
 
 

FusionDB® FQL Training

This repository provides a training for FusionDB FQL.

In this training you will learn to:

  • run SQL queries on FusionDB
  • run SQL++ queries on FusionDB
  • use FusionDB's SQL JDBC Server
  • write the result of SQL queries to RDBMS (MySQL、PostgreSQL、Oracle)、S3、ADLS and HDFS

Requirements

  • Remote HDFS or RDBMS (MySQL、PostgreSQL、Oracle)/S3/ADLS/GCP/OSS etc.
  • Optional: Jupyter Notebook、PSequel

You only need Docker to run this training.
You don't need Java, Scala, or an IDE.

For more information, please refer to FusionDB Document

Quickstart

Fusiondb is a simple and powerful federated database engine.

  • Start FusionDB
docker run --name fdb -p 54322:54322 -itd fusiondb/fusiondb:0.1.0-beta
  • Check FusionDB server
docker ps -a|grep fdb

lsof -i :54322
  • Psycopg2 connecting FusionDB
import psycopg2
import pandas as pd
connection = psycopg2.connect("host=localhost port=54322 dbname=default user=fdb sslmode=disable")
df = pd.read_sql(sql="SELECT * FROM VALUES (1, 1), (1, 2) AS t(a, b);", con=connection)
df

## Load mysql table

df =pd.read_sql(sql="load 'mysql' options('url'='jdbc:mysql://localhost:53306/fdb_test','dbtable'='person','user'= 'root','password'='root') AS mysql_t2;", con=connection)
df =pd.read_sql(sql="SELECT * FROM mysql_t2;", con=connection)
df.head()

## Load Postgres table

df =pd.read_sql(sql="load 'postgresql' options('url'='jdbc:postgresql://localhost:15430/fdb','dbtable'='person','user'= 'fdb','password'='fdb123') AS gp_t1;", con=connection)
df =pd.read_sql(sql="SELECT * FROM gp_t1;", con=connection)
df.head()


## MySQL Table Join PostgreSQL Table

df =pd.read_sql(sql="CREATE table test as SELECT mysql_t2.* FROM mysql_t2 LEFT JOIN gp_t1 ON mysql_t2.id = gp_t1.id;", con=connection)
df =pd.read_sql(sql="SELECT * FROM test;", con=connection)
df.head()

## Load oracle table 

df =pd.read_sql(sql="load oracle options('url'='jdbc:oracle:thin:SYSTEM/oracle@//localhost:49161/xe','dbtable'='FDBTEST20','user'= 'SYSTEM','password'='oracle') AS ora_t1;", con=connection)
df =pd.read_sql(sql="SELECT * FROM ora_t1 limit 10;", con=connection)
df.head()


## Load HDFS parquet

df =pd.read_sql(sql="load 'hdfs://jdp-1:8020/tmp/spark-tpcds-data/web_site' format parquet AS web_site;", con=connection)
df =pd.read_sql(sql="SELECT * FROM web_site limit 10;", con=connection)
df.head()

## Save table to hdfs

df =pd.read_sql(sql="save overwrite web_site TO 'hdfs://jdp-1:8020/tmp/web_site_test' FORMAT parquet;", con=connection)
df =pd.read_sql(sql="load 'hdfs://jpd-1:8020/tmp/web_site_test' format parquet AS mysql_t2_par;", con=connection)
df =pd.read_sql(sql="SELECT * FROM mysql_t2_par limit 10;", con=connection)
df.head()

For more information, please refer to FusionDB Document