Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL's ResultSet of SELECT * is wrong when with Masking Feature #27879

Closed
134130 opened this issue Aug 3, 2023 · 11 comments
Closed

MySQL's ResultSet of SELECT * is wrong when with Masking Feature #27879

134130 opened this issue Aug 3, 2023 · 11 comments

Comments

@134130
Copy link
Contributor

134130 commented Aug 3, 2023

Bug Report

For English only, other languages will not accept.

Before report a bug, make sure you have:

Please pay attention on issues you submitted, because we maybe need more details.
If no response anymore and we cannot reproduce it on current information, we will close it.

Please answer these questions before submitting your issue. Thanks!

Which version of ShardingSphere did you use?

5.4.0

Which project did you use? ShardingSphere-JDBC or ShardingSphere-Proxy?

ShardingSphere-JDBC with SharingSphere-Mask

Expected behavior

[
  {
    "actor_id": 77,
    "film_id": 8,
    "last_update": "2006-02-15 05:03:42",
    "first_name": "CARY",
    "last_name": "M********Y",
    "title": "AIRPORT POLLOCK",
    "description": "A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India",
    "release_year": 2006,
    "language_id": 1,
    "original_language_id": null,
    "rental_duration": 6,
    "rental_rate": 4.99,
    "length": 54,
    "replacement_cost": 15.99,
    "rating": "R",
    "special_features": "Trailers"
  }
]

Actual behavior

[
  {
    "actor_id": 1,
    "first_name": 1,
    "last_name": "2006-02-15 05:05:03",
    "last_update": 79
  }
]

Reason analyze (If you can)

May MaskingContext pollute MetadataContext

Steps to reproduce the behavior, such as: SQL to execute, sharding rule configuration, when exception occur etc.

  1. build.gradle.kts

    dependencies {
        // ShardingSphere
        implementation("org.apache.shardingsphere:shardingsphere:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-jdbc:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-jdbc-core:5.4.0")
    
        // ShardingSphere Mask
        implementation("org.apache.shardingsphere:shardingsphere-mask:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-api:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-core:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-distsql:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-distsql-parser:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-distsql-statement:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-distsql-handler:5.4.0")
        implementation("org.apache.shardingsphere:shardingsphere-mask-distsql-parser:5.4.0")
    
        // jdbc
        runtimeOnly("com.mysql:mysql-connector-j")
    }
  2. Prepare MySQL

    docker run --name mysql -e MYSQL_ROOT_PASSWORD=root -d -p 3306:3306 mysql
  3. Run these SQLs

    -- Sakila MySQL Sample Database: https://dev.mysql.com/doc/sakila/en/
    -- Use sakila or run this query.
    
    CREATE DATABASE sakila;
    USE sakila;
    
    -- DDL
    CREATE TABLE `actor` (
      `actor_id` smallint unsigned NOT NULL AUTO_INCREMENT,
      `first_name` varchar(45) NOT NULL,
      `last_name` varchar(45) NOT NULL,
      `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`actor_id`)
    );
    CREATE TABLE `film` (
      `film_id` smallint unsigned NOT NULL AUTO_INCREMENT,
      `title` varchar(128) NOT NULL,
      `description` text,
      `release_year` year DEFAULT NULL,
      `language_id` tinyint unsigned NOT NULL,
      `original_language_id` tinyint unsigned DEFAULT NULL,
      `rental_duration` tinyint unsigned NOT NULL DEFAULT '3',
      `rental_rate` decimal(4,2) NOT NULL DEFAULT '4.99',
      `length` smallint unsigned DEFAULT NULL,
      `replacement_cost` decimal(5,2) NOT NULL DEFAULT '19.99',
      `rating` enum('G','PG','PG-13','R','NC-17') DEFAULT 'G',
      `special_features` set('Trailers','Commentaries','Deleted Scenes','Behind the Scenes') DEFAULT NULL,
      `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`film_id`)
    );
    CREATE TABLE `film_actor` (
      `actor_id` smallint unsigned NOT NULL,
      `film_id` smallint unsigned NOT NULL,
      `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
      PRIMARY KEY (`actor_id`,`film_id`)
    );
    
    -- DML
    INSERT INTO sakila.actor (actor_id, first_name, last_name, last_update, social_number) VALUES (1, 'PENELOPE', 'GUINESS', '2006-02-15 04:34:33', '010-1234-1234');
    INSERT INTO sakila.film (film_id, title, description, release_year, language_id, original_language_id, rental_duration, rental_rate, length, replacement_cost, rating, special_features, last_update) VALUES (1, 'ACADEMY DINOSAUR', 'A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies', 2006, 1, null, 6, 0.99, 86, 20.99, 'PG', 'Deleted Scenes,Behind the Scenes', '2006-02-15 05:03:42');
    INSERT INTO sakila.film_actor (actor_id, film_id, last_update) VALUES (1, 1, '2006-02-15 05:05:03');
  4. Run this Query with ShardingSphere-Masking

    SELECT *
    FROM film_actor
             JOIN actor AS a ON a.actor_id
             JOIN film AS f ON f.film_id
    LIMIT 1;

Example codes for reproduce this issue (such as a github link).

fun main() {

	// Hikari
	val hikariDataSource = HikariDataSource().apply {
		driverClassName = "com.mysql.jdbc.Driver"
		jdbcUrl = "jdbc:mysql://localhost:3306/sakila"
		username = "root"
		password = "root"
	}
	
	// ShardingSphere
	val maskRuleConfiguration = MaskRuleConfiguration(
		Collections.singleton(MaskTableRuleConfiguration ("actor", listOf(
			MaskColumnRuleConfiguration("first_name", "first_name_mask"),
		))),
		mapOf(
			"first_name_mask" to AlgorithmConfiguration("KEEP_FIRST_N_LAST_M", Properties().apply {
				setProperty("first-n", "1")
				setProperty("last-m", "1")
				setProperty("replace-char", "*")
			}),
		)
	)
	
	val dataSource = ShardingSphereDataSourceFactory.createDataSource(
		hikariDataSource,
		listOf(maskRuleConfiguration),
		Properties()
	)

	// Query 
	val sql = "SELECT * FROM film_actor\n" +
		  "         JOIN actor AS a ON a.actor_id\n" +
		  "         JOIN film AS f ON f.film_id\n" +
		  "LIMIT 1"

	// Query directly (Expected)
	query(sql, hikariDataSource.connection)

	// Query with ShardingSphere (Actual)
	query(sql, dataSource.connection)
}

fun query(sql: String, connection: Connection) {
	val statement = connection.createStatement()
	val resultSet = statement.executeQuery(sql)

	while (resultSet.next()) {
		for (i in 1..resultSet.metaData.columnCount) {
			println("${resultSet.metaData.getColumnName(i)}: ${resultSet.getString(i)} ")
		}
		println()
	}
}
@azexcy
Copy link
Contributor

azexcy commented Aug 3, 2023

Hi @zhaojinchao95 , could you help with this issue?

@134130 134130 changed the title SELECT * statement of MySQL's ResultSet is wrong when with Masking Feature MySQL's ResultSet of SELECT * is wrong when with Masking Feature Aug 3, 2023
@134130
Copy link
Contributor Author

134130 commented Aug 3, 2023

Also, the result is also broken too when add both of masking rules.

actor_id: 1
first_name: 1
last_name: 2006-02-15 05:05:03.0
last_update: 127
film_id: KEVIN
title: G*****D
description: 2006-02-15 04:34:33.0
release_year: 9
language_id: ALABAMA DEVIL
original_language_id: A Thoughtful Panorama of a Database Administrator And a Mad Scientist who must Outgun a Mad Scientist in A Jet Boat
rental_duration: 2006-01-01
rental_rate: 1
length: null
replacement_cost: 3
rating: 2.99
special_features: 114
last_update: 21.99

@134130
Copy link
Contributor Author

134130 commented Aug 3, 2023

I debugged 6 cases ([no-masking, actor.first_name-masking, actor.first_name,film.title-masking] * 2 below queries)

  1.  SELECT *
     FROM film_actor
              JOIN actor AS a ON a.actor_id
              JOIN film AS f ON f.film_id
     LIMIT 1;
  2.  SELECT a.first_name, f.title
     FROM film_actor
              JOIN actor AS a ON a.actor_id
              JOIN film AS f ON f.film_id
     LIMIT 1;

First of all, ShardingSphere needs to predict result columns sequence correctly.

  1. Currently, ShardingSphere creates metadata before query from parsed data.
    1-a. When we give only one MaskTableRuleConfiguration to JOIN Query which takes 2 or more than 2 tables, creates only one projection, It makes The ShardingSphereResultSetUtils.createColumnLabelAndIndexMap() only predicates one table.
    1-b. When we give all of MaskTableRuleConfiguration to JOIN Query which takes many tables (Table N : N MaskTableRuleConfiguration), it still can breaks result since the MySQL's column sequence can be mixed. (In USING JOIN case, the column's order is mixed)
  2. 1st Query above is breaking since film_actor is missing when getProjections(). With this case, MySQL, film_actor's all fields should projected, actor's all fields should projected and film's all fileds should projected. Because ShardingSphereResultSetUtils.createColumnLabelAndIndexMap() is just arrange the columns, column names will collision (film_actor table and actor table have same column name "actor_id")

@strongduanmu
Copy link
Member

Hi @134130, can you try master branch? We made some optimizations for the asterisk expansion.

@134130
Copy link
Contributor Author

134130 commented Aug 4, 2023

@strongduanmu Hi, thanks for your comment.
When I tried little tests on master branch, It looks masking is not working, but I'll try and comment again. Thank you!

@strongduanmu
Copy link
Member

Thanks for testing and look forward to your feedback.

@134130
Copy link
Contributor Author

134130 commented Aug 6, 2023

  • No Mask
    • SELECT * : Passed ✅
    • SELECT a.first_name, f.title : Error ❌ -> Unknown column 'first_name' in 'field list'.
  • actor.first_name Mask
    • SELECT * : Fail ⚠️ -> Nothing masked
    • SELECT a.first_name, f.title : Error ❌ -> Unknown column 'title' in 'field list'.
  • actor.first_name and film.title Mask
    • SELECT * : Fail ⚠️ -> last_update field has masked
    • SELECT a.first_name, f.title : Passed ✅

@134130
Copy link
Contributor Author

134130 commented Aug 6, 2023

package org.apache.shardingsphere.driver;

import com.zaxxer.hikari.HikariDataSource;
import groovy.lang.Tuple2;
import org.apache.shardingsphere.driver.api.ShardingSphereDataSourceFactory;
import org.apache.shardingsphere.infra.config.algorithm.AlgorithmConfiguration;
import org.apache.shardingsphere.infra.config.rule.RuleConfiguration;
import org.apache.shardingsphere.mask.api.config.MaskRuleConfiguration;
import org.apache.shardingsphere.mask.api.config.rule.MaskColumnRuleConfiguration;
import org.apache.shardingsphere.mask.api.config.rule.MaskTableRuleConfiguration;
import org.junit.jupiter.api.Test;

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;

public class CooperTests {

    private static List<Tuple2<String, Integer>> getSQLs() {
        List<Tuple2<String, Integer>> result = new ArrayList<>();
        result.add(new Tuple2<>("SELECT * FROM film_actor JOIN actor AS a ON a.actor_id JOIN film AS f ON f.film_id LIMIT 1", 21));
        result.add(new Tuple2<>("SELECT a.first_name, f.title FROM film_actor JOIN actor AS a ON a.actor_id JOIN film AS f ON f.film_id LIMIT 1", 2));
        return result;
    }
    
    @Test
    public void no_mask_test() throws SQLException {
        MaskRuleConfiguration maskRule = createMaskRule(Collections.emptyList());
        DataSource dataSource = createDataSource(Collections.singleton(maskRule));
        
        for (Tuple2<String, Integer> sql : getSQLs()) {
            try (Connection connection = dataSource.getConnection()) {
                ResultSet resultSet = connection.createStatement().executeQuery(sql.getV1());
                assertThat(resultSet.getMetaData().getColumnCount(), is(sql.getV2()));

                printResultSet(resultSet);
            }
        }
    }
    
    @Test
    public void first_name_mask_test() throws SQLException {
        MaskRuleConfiguration maskRule = createMaskRule(Collections.singleton(createMaskTableRule("actor", "first_name")));
        DataSource dataSource = createDataSource(Collections.singleton(maskRule));

        for (Tuple2<String, Integer> sql : getSQLs()) {
            try (Connection connection = dataSource.getConnection()) {
                ResultSet resultSet = connection.createStatement().executeQuery(sql.getV1());
                assertThat(resultSet.getMetaData().getColumnCount(), is(sql.getV2()));
            }
        }
    }
    
    @Test
    public void first_name_and_title_mask_test() throws SQLException {
        MaskRuleConfiguration actorMaskRule = createMaskRule(
                Collections.singleton(createMaskTableRule("actor", "first_name")));
        MaskRuleConfiguration filmMaskRule = createMaskRule(
                Collections.singleton(createMaskTableRule("film", "title")));
        
        Collection<RuleConfiguration> maskRules = Arrays.asList(actorMaskRule, filmMaskRule);
        DataSource dataSource = createDataSource(maskRules);

        for (Tuple2<String, Integer> sql : getSQLs()) {
            try (Connection connection = dataSource.getConnection()) {
                ResultSet resultSet = connection.createStatement().executeQuery(sql.getV1());
                assertThat(resultSet.getMetaData().getColumnCount(), is(sql.getV2()));
            }
        }
    }
    
    private void printResultSet(ResultSet resultSet) throws SQLException {
        while (resultSet.next()) {
            System.out.println(resultSet.getString(1) + ", " + resultSet.getString(2));
            System.out.println();
        }
        System.out.println("==============================");
    }
    
    private MaskRuleConfiguration createMaskRule(Collection<MaskTableRuleConfiguration> tableRules) {
        Properties props = new Properties();
        props.setProperty("first-n", "1");
        props.setProperty("last-m", "1");
        props.setProperty("replace-char", "*");
        
        AlgorithmConfiguration algorithm = new AlgorithmConfiguration("KEEP_FIRST_N_LAST_M", props);
        Map<String, AlgorithmConfiguration> algorithms = Collections.singletonMap(
                "default_cooper", algorithm);
        return new MaskRuleConfiguration(tableRules, algorithms);
    }
    
    private MaskTableRuleConfiguration createMaskTableRule(String table, String column) {
        MaskColumnRuleConfiguration columnRule = new MaskColumnRuleConfiguration(column, "default_cooper");
        return new MaskTableRuleConfiguration(table, Collections.singleton(columnRule));
    }
    
    private DataSource createDataSource(Collection<RuleConfiguration> configs) {
        try {
            return ShardingSphereDataSourceFactory.createDataSource(
                    createHikariDataSource(),
                    configs,
                    new Properties());
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
    
    private HikariDataSource createHikariDataSource() {
        HikariDataSource result = new HikariDataSource();
        result.setDriverClassName("com.mysql.jdbc.Driver");
        result.setJdbcUrl("jdbc:mysql://localhost:3306/sakila?useSSL=false");
        result.setUsername("root");
        result.setPassword("root");
        return result;
    }
}

Here is my test code what I used. Sorry for I am not used to Java tests.

@strongduanmu
Copy link
Member

Thank you for your feedback, I will check this exception again.

@strongduanmu strongduanmu self-assigned this Aug 23, 2023
@strongduanmu
Copy link
Member

Hi @134130, I have test this case with 5130950, and it can work now. Besides, I do some change for your test case, since from 5.4.0, ShardingSphere will not load single tables by default.

    private DataSource createDataSource(Collection<RuleConfiguration> configs) {
        try {
            Collection<RuleConfiguration> allConfigs = new LinkedList<>(configs);
            SingleRuleConfiguration singleRuleConfig = new SingleRuleConfiguration();
            singleRuleConfig.setTables(Collections.singletonList("*.*"));
            allConfigs.add(singleRuleConfig);
            return ShardingSphereDataSourceFactory.createDataSource(
                    createHikariDataSource(),
                    allConfigs,
                    new Properties());
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

@strongduanmu strongduanmu added this to the 5.4.1 milestone Aug 23, 2023
@strongduanmu
Copy link
Member

Since this bug has been fixed in master branch 5130950, I will close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants