Roadmap 2021 #537

zhicwu opened this issue Jan 5, 2021 · 14 comments

@zhicwu
Contributor

zhicwu commented Jan 5, 2021

0.2.x

Focus on bug fixes, small enhancements, and backward compatibility...
  • 0.2.5
    • switch to GitHub Actions for consistency
    • use Testcontainers for integration tests
  • 0.2.6
    • enable retry for idempotent queries (as a workaround for "host failed to respond")
    • new sql parser
    • use basic auth instead of query parameters for authentication
  • 0.2.7 - TBD (in case of any critical issue)
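
For the 0.2.6 item on basic auth, here is a minimal sketch of what sending credentials in an `Authorization` header (instead of as `user`/`password` query parameters) might look like. The class and method names are hypothetical and are not taken from the driver; only the header format follows HTTP Basic authentication.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of clickhouse-jdbc.
final class BasicAuthSketch {
    // Builds the value of an HTTP "Authorization" header for Basic auth:
    // "Basic " followed by base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Example credentials are made up.
        System.out.println("Authorization: " + basicAuthHeader("default", "secret"));
    }
}
```

Keeping credentials out of the query string means they are less likely to end up in access logs or proxies that record request URLs.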

0.3.x

Focus on new features, code clean up, and abstraction which may break existing interfaces/APIs...

Previous releases...
  • 0.3.0
    • BREAKING CHANGE: drop JDK7 support
    • BREAKING CHANGE: remove Guava dependency (UnsignedLong is removed; please use long (faster) or BigInteger (slower) instead for UInt64)
      ~~Note: shaded jar is now ~3.65MB(was 7.19MB in 0.2.6, and 5.68MB in 0.2.4).~~
    • JDBC 4.2 support
    • more data types (including aliases) like IPv4, IPv6, DateTime64, *Int128, *Int256, Decimal256 and Map
      Note: UInt128 will be supported soon on server side.
    • RoaringBitmap support - please use the latest RoaringBitmap
    • restructure code (clickhouse-jdbc for JDBC compliance, and clickhouse-*client for efficiency and consistent behaviors like any other clickhouse client, see Proposal of restructuring code (RFC) #570)
    • performance test (clickhouse-jdbc vs. clickhouse4j vs. clickhouse-native-jdbc vs. mariadb-java-client)
    • CI enhancement: checkstyle, spellcheck & SonarCloud
  • 0.3.1
Ongoing releases...
  • 0.3.2
    • JPMS support along with multi-release jars
       19M  target/clickhouse-jdbc-0.3.2-SNAPSHOT-all.jar
       18M  target/clickhouse-jdbc-0.3.2-SNAPSHOT-grpc.jar
      664K  target/clickhouse-jdbc-0.3.2-SNAPSHOT-http.jar
      960K  target/clickhouse-jdbc-0.3.2-SNAPSHOT-javadoc.jar
      2.7M  target/clickhouse-jdbc-0.3.2-SNAPSHOT-shaded.jar
      428K  target/clickhouse-jdbc-0.3.2-SNAPSHOT.jar
    • introduce abstract module clickhouse-client, experimental clickhouse-grpc-client, and HttpURLConnection-based clickhouse-http-client
    • named parameter support (only available in clickhouse-client)
    • support RowBinary* format and more data types (Geo types, Date32, Tuple, Nested, mixed use of Array/Tuple/Map etc.)
    • new JDBC driver(com.clickhouse.jdbc.ClickHouseDriver) built on top of clickhouse-client
      Note: both old and new drivers will co-exist in 0.3.x series and the old one will be removed starting from 0.4.
    • show schema of remote data sources (when JDBC bridge is available)
    • fix timezone and DateTime64 related issues
    • adaptive integration tests against a local test container or a remote server, with cases categorized under different groups
    • replace jackson-databind and jackson-core with Gson
    • enhance benchmarks to cover most JDBC drivers and data types
    • alternative implementation for the http(s) protocol (JDK 11 HttpClient)
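
Regarding the 0.3.0 note about UInt64: the sketch below shows one way to work with unsigned 64-bit values using only the JDK, either as a raw `long` (bit pattern, fast) or as a `BigInteger` (exact, slower). The class and method names are hypothetical and do not reflect the driver's API; the `Long` unsigned helpers are standard since Java 8.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of clickhouse-jdbc.
final class UInt64Sketch {
    // Interprets the 64 bits of a long as an unsigned value.
    static BigInteger toBigInteger(long bits) {
        return new BigInteger(Long.toUnsignedString(bits));
    }

    // Converts back to the raw 64-bit pattern, rejecting values outside [0, 2^64 - 1].
    static long fromBigInteger(BigInteger value) {
        if (value.signum() < 0 || value.bitLength() > 64) {
            throw new IllegalArgumentException("Out of UInt64 range: " + value);
        }
        return value.longValue(); // keeps the low-order 64 bits
    }

    public static void main(String[] args) {
        long max = -1L; // all 64 bits set, i.e. the largest UInt64 value
        System.out.println(Long.toUnsignedString(max));
        System.out.println(toBigInteger(max));
        // Signed comparison would treat -1L as smaller than 1L; use the unsigned variant.
        System.out.println(Long.compareUnsigned(max, 1L) > 0);
    }
}
```

Staying with `long` avoids allocations on hot paths, at the cost of having to remember the unsigned helpers for printing, comparing and dividing.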
@zhicwu zhicwu pinned this issue Jan 5, 2021
@enqueue
Contributor

enqueue commented Jan 5, 2021

Hi @zhicwu, I am excited to see renewed activity in this project. I think it is important that the official JDBC driver is maintained. Please let me know if there is anything I can help with.

@zhicwu
Contributor Author

zhicwu commented Jan 6, 2021

Thanks @enqueue, looking forward to working with you to carry on the development :) I'm focusing on 0.2.5 at this point. I think we can resume the work on JDBC 4.2 support. I saw that the PR has been pending for a while, and I hope you don't mind spending some more time on it by merging changes from 0.2.x and making it part of 0.3.0.

@enqueue
Contributor

enqueue commented Jan 6, 2021

No need to hurry regarding new features. I will try to help and take a stab at the items for 0.2.5 if you don't mind.

@zhicwu
Contributor Author

zhicwu commented Jan 6, 2021

> No need to hurry regarding new features. I will try to help and take a stab at the items for 0.2.5 if you don't mind.

Sure. Feel free to ping me on Telegram :)

@kiwimg

kiwimg commented Jan 25, 2021

The roadmap is very nice.

@enqueue
Contributor

enqueue commented Feb 6, 2021

Thanks for your effort, there sure is a lot on the roadmap for this year, and you already delivered some bugfixes and improvements! Most of the items sound very reasonable. There are three topics I would like to see represented a little bit more:

JDBC compatibility

This is a JDBC driver, not just a Java library for talking more easily to ClickHouse. We should strive for better JDBC compatibility. I think that @serge-rider might have some valuable input here, too.

  • Review the behavior of all classes implementing JDBC interfaces. I see a lot of "dummy" method implementations.
  • Review the use of exceptions: there are a couple of unchecked exceptions being thrown in places where the API mandates the use of SQLException. Is there some rationale behind this?
  • Try to run some of the "official" JDBC tests. The Eclipse Foundation has released the Java EE TCK, which includes a couple of JDBC tests. I think that testing our driver against this could give us more confidence. We will not be able to get full compliance, though.
  • Review data types and their mappings against the JDBC spec and other driver implementations. Plus, ensure that the driver is robust with regard to new data types being introduced in ClickHouse.
  • Decide how we want to treat the concepts of schema and catalog
  • Implement Structured Types, e.g. for mapping JSON stuff
  • Review scalar functions compatibility and escape syntax (JDBC 4.2 spec, section 13.4.1 + Appendix C). I do not think that this is widely popular in the ClickHouse user base.
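
To illustrate the point about exceptions and "dummy" methods, here is a minimal sketch, assuming a hypothetical wrapper that converts unchecked exceptions into `SQLException` and an unsupported operation that reports itself via `SQLFeatureNotSupportedException`. None of these names come from the driver, and the SQLSTATE "HY000" (general error) is an assumption chosen for the example.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical helper for illustration only; not part of clickhouse-jdbc.
final class JdbcErrorSketch {
    // A unit of work that may fail with any exception.
    interface Action<T> {
        T run() throws Exception;
    }

    // Runs the action and makes sure callers only ever see SQLException,
    // as the JDBC interfaces declare.
    static <T> T wrap(Action<T> action) throws SQLException {
        try {
            return action.run();
        } catch (SQLException e) {
            throw e; // already the right type
        } catch (Exception e) {
            // "HY000" is an assumed generic SQLSTATE for this sketch.
            throw new SQLException(e.getMessage(), "HY000", e);
        }
    }

    // Instead of a silent no-op, an unsupported feature says so explicitly.
    static void unsupportedFeature(String name) throws SQLException {
        throw new SQLFeatureNotSupportedException(name + " is not supported");
    }

    public static void main(String[] args) {
        try {
            wrap(() -> Integer.parseInt("not a number"));
        } catch (SQLException e) {
            System.out.println("Wrapped cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```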

Cleaner code

The code has seen a couple of authors, each one of them using an individual style, and having different preferences regarding patterns. Bringing more consistency into this would help everybody to understand the code better and improve quality. Here are a couple of ideas:

  • Review code, remove unused private methods etc.
  • Install a checkstyle template (or something similar), which could check formatting as well as import order etc. (yes, I am guilty...)
  • Provide doc / template for Eclipse, VS Code for import order, code format.
  • Consider enabling -Werror
  • Run the code through SpotBugs (or something comparable) and fix the issues
  • Create some doc about the expected code style, naming conventions (e.g. ClickHouse versus Clickhouse in class names) and which patterns to avoid.
  • After official migration to Java 8 code base, we can use simpler syntax in some places without affecting behavior, e.g. use diamond, lambdas in tests (assertThrows(IAE.class, () -> doSomething()))
  • Harmonize integration test code. Almost every test does something similar "in its own way". I think that ClickHouseContainerForTest is a great start, perhaps we could achieve better clarity by sticking to one pattern or even providing an AbstractClickHouseIntegrationTestCase 😉
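
As a rough illustration of the Java 8 item above (diamond operator and lambdas in tests), consider the following sketch. The `assertThrows` helper here is a small stand-in written for this example so that it runs without a test framework; `parsePort` is a made-up method under test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example for illustration only; not part of clickhouse-jdbc.
final class Java8SyntaxSketch {
    // Stand-in for a test framework's throwing-lambda type.
    interface ThrowingRunnable {
        void run() throws Throwable;
    }

    // Stand-in for a framework assertion: passes only if the code throws the expected type.
    static <T extends Throwable> T assertThrows(Class<T> expected, ThrowingRunnable code) {
        try {
            code.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }

    // Made-up method under test.
    static int parsePort(String value) {
        int port = Integer.parseInt(value);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        // Diamond operator: no need to repeat the type argument on the right-hand side.
        List<String> inputs = new ArrayList<>();
        inputs.add("0");
        inputs.add("70000");
        // Lambda instead of a try/fail/catch block per case.
        inputs.forEach(input ->
                assertThrows(IllegalArgumentException.class, () -> parsePort(input)));
        System.out.println("All invalid ports were rejected");
    }
}
```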

Control non-JDBC API

The driver also provides some proprietary API, and this obviously has its benefits, too. However, I have seen some situations in which it was not clear whether a method on e.g. ClickHouseConnectionImpl was public on purpose. We must make sure that the public API is under control and production ready.

  • Review existing public methods: Do we want to expose them as API? If not, deprecate them.
  • Provide (and publish) JavaDoc for public API.
  • Consider tightening control using Java module to avoid public methods leaking into API.

Performance

The other community members are much more knowledgeable about performance than I am; here is just a simple observation:
ResultSet is actually made for scrolling, i.e. the driver does not need to hold the whole result in memory. The first results could be available earlier and the memory does not have to explode. Some of the "big boys" use paging in the background. I do not know how to implement this using HTTP client and ClickHouse server, but there is probably a smart person out there who does.

Perhaps we can discuss the items together, drop the ones which are not interesting to the community, create tickets for the remaining ones and bring them in order according to this roadmap. Do you see a chance for a short chat or virtual meeting with the major stakeholders and users?

@zhicwu
Contributor Author

zhicwu commented Feb 8, 2021

These are great inputs! IMO, putting the repository name aside, clickhouse-jdbc is both a JDBC driver and a Java client for ClickHouse :) Implementing JDBC interfaces simplifies integration, while as a Java client it should stick with the ClickHouse way: simple to use and efficient at runtime. Alternatively, we can simply treat it as a JDBC wrapper over a Java client for ClickHouse, meaning we can choose a suitable interface to talk to ClickHouse from JVM languages, depending on what we need.

I'm with you on JDBC compliance. That's why I created #545 and suggested introducing a new setting. On the other hand, we probably don't have to follow JDBC standards completely, not only because ClickHouse has its own limitations (like transaction support, server-side cursors, etc.), but also to save time so that we can focus on the things that matter most (like support for more data types, streaming, caching, stability, etc.). It seems we'd better restructure the code early using multiple modules/packages.

I think we can start to create more issues/feature requests and add them into the release plan, starting from 0.3.0. I'm sure we can fill the gap between 0.4 and 1.0 quickly ;) And of course it's a plan that we can change as needed.

Some more comments:

  • JDBC compliance
    1. Agree that error handling and empty implementation should be improved.
    2. Is Eclipse's TCK based on the one created by Sun? If it works on JDK 8, we should definitely add a new GitHub workflow for that.
    3. Yes, we can review data type mappings as well as concepts like catalog/schema, and get them documented.
    4. I saw 4.2, but what about 4.3? Also, I'm thinking of breaking the large PR into several smaller PRs, so that it's easier to review and we can deliver them gradually in multiple releases.
  • Cleaner code
    1. We have GitHub Actions for CI/CD, but they're far from good. Yes, we do need to apply a checkstyle template, coverage analysis and benchmarking.
    2. RE: integration tests, since we're using TestNG, I guess we just need to categorize them under different groups. And of course a base class is helpful, so that we don't have to add ClickHouseContainerForTest.beforeSuite() in the setUp method ;)
  • Control non-JDBC API
    1. Yes, we need to be more careful with modifiers. We should avoid cases like Restore public getColumnNames method #499.
    2. I didn't see any issue related to OSGi/Java 9, but we can use packages/Maven modules to separate interface and implementation.
    3. Besides a unified client API for hiding protocol details, we may still need to expose API for specific client like http/grpc/native for better performance/usability.
  • Performance
    1. ClickHouse feeds response in chunks, which in a way is a forward-only "cursor" that we can use.
    2. Benchmark should be added to measure hot areas, so that it's clear if a PR will improve performance or not.
    3. Thought about voting, but now I'd say requirements come from GitHub issues and Telegram/Slack :) It's definitely a good idea to set up a meeting to go over the list and exchange thoughts before 0.3.0. Give me a few days to digest all this information and let me see what I can do to kick off future development.
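
The forward-only "cursor" idea from the Performance item can be sketched roughly as follows: rows are parsed lazily from the response stream, so only the current row needs to be held in memory. This is a simplified illustration under several assumptions: the class is hypothetical, the response is treated as tab-separated text, escaping rules are ignored, and an in-memory stream stands in for the HTTP response body.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical helper for illustration only; not part of clickhouse-jdbc.
final class StreamingRowsSketch implements Iterator<String[]>, AutoCloseable {
    private final BufferedReader reader;
    private String nextLine;  // row read ahead by hasNext(), or null at end of stream
    private boolean fetched;  // true if nextLine holds the result of a read-ahead

    StreamingRowsSketch(InputStream response) {
        this.reader = new BufferedReader(new InputStreamReader(response, StandardCharsets.UTF_8));
    }

    @Override
    public boolean hasNext() {
        if (!fetched) {
            try {
                nextLine = reader.readLine(); // pulls only as much as needed from the stream
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            fetched = true;
        }
        return nextLine != null;
    }

    @Override
    public String[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        fetched = false; // forward-only: the row is handed out once and then forgotten
        return nextLine.split("\t", -1);
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a chunked HTTP response body.
        InputStream body = new ByteArrayInputStream("1\ta\n2\tb\n".getBytes(StandardCharsets.UTF_8));
        try (StreamingRowsSketch rows = new StreamingRowsSketch(body)) {
            while (rows.hasNext()) {
                System.out.println(String.join(" | ", rows.next()));
            }
        }
    }
}
```

With this shape, the first rows become available as soon as the first chunk arrives, and memory use stays bounded by the reader's buffer rather than by the result size.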

@Veryfirefly

Hi, I look forward to contributing to clickhouse-jdbc, but I am not quite sure what work needs to be done at present. I didn't find any springboot-clickhouse-jdbc-starter repository on GitHub, so I implemented one myself. Is there a plan for future work? Or is there anything I can help with?

@zhicwu
Contributor Author

zhicwu commented Jul 19, 2021

Sorry for the late response @Veryfirefly. Of course you can help :) I'm still in the middle of refactoring, but I think you can start with unassigned issues. Alternatively, if you have some new ideas regarding this driver, or the Java client to be more specific, please feel free to create a new issue for discussion and then submit a pull request for the implementation.

@Veryfirefly

> Sorry for the late response @Veryfirefly. Of course you can help :) I'm still in the middle of refactoring, but I think you can start with unassigned issues. Alternatively, if you have some new ideas regarding this driver, or the Java client to be more specific, please feel free to create a new issue for discussion and then submit a pull request for the implementation.

Thanks. How do I claim an issue? I am not very clear on that, sorry...

@zhicwu
Contributor Author

zhicwu commented Jul 21, 2021

> Thanks. How do I claim an issue? I am not very clear on that, sorry...

That's fine. Ping me (zhicwu) on Telegram/Slack and we can discuss more there.

@kiwimg

kiwimg commented Oct 31, 2021

When will 0.3.2 be released?
In addition, can bitmaps support bulk inserts?

@zhicwu
Contributor Author

zhicwu commented Nov 1, 2021

> When will 0.3.2 be released

It's taking much longer than I thought. I don't have an exact ETA at this point, but I'll create a few "beta" releases for public testing after getting the new driver merged.

> Can bitmaps support bulk inserts?

Are you using JDBC batch insertion? You can use the extended API before 0.3.2. Feel free to open a new issue for discussion.

@zhicwu zhicwu unpinned this issue Dec 29, 2021
@zhicwu
Contributor Author

zhicwu commented Dec 29, 2021

Closing this one and moving unfinished items to #784.

@zhicwu zhicwu closed this as completed Dec 29, 2021