Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/1598 MySQL implement connection pooling #1600

Merged
merged 11 commits into from Mar 22, 2019

Conversation

pmo-sdr
Copy link
Collaborator

@pmo-sdr pmo-sdr commented Mar 13, 2019

Connect MySQL database using pooling technique.
Code was changed to meet CheckStyle rules and fix some issues.

#1598
#1597

.gitignore Outdated
cygnus-ngsi/ignore
.DS_Store .DS_Store
archived/ archived/
cygnus-ngsi/ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep previous version of this file

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done a76d3c5

LOGGER.debug("Number of cached mysql connections: " + connections.size());
DataSource datasource = createConnectionPool(dbName);
connections.put(dbName, datasource);
connection = datasource.getConnection();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Al principio o al final de este metodo pondria una traza con el tamaño de connections

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agree with this, modified in 102e8f1

// Check Pool cache and log status
if (pools.containsKey(dbName)){
GenericObjectPool pool = pools.get(dbName);
LOGGER.debug("Pool status (" + dbName + ") Max.: " + pool.getMaxActive() + "; Active: "
+ pool.getNumActive() + "; Idle: " + pool.getNumIdle());
}else{
LOGGER.error("Can't find dabase in pool cache (" + dbName + ")");
}

// FIXME: the number of cached connections should be limited to a certain number; with such a limit
// number, if a new connection is needed, the oldest one is closed
Connection con = connections.get(dbName);
// FIXME: the number of cached connections should be limited to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Este comentario de FIXME entiendo que ya se podria borrar, ¿no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of connections per database, is now limitted by the pool, but,
Are there any behaviour within we need to limit the number of databases (pools)?

ie, limit number when Cygnus tries to create a new database per entity type, or something like that.

If this is the case, It would be interesting to implement it in this PR.

@@ -235,6 +235,7 @@ A configuration example could be:
cygnus-ngsi.sinks.mysql-sink.mysql_port = 3306
cygnus-ngsi.sinks.mysql-sink.mysql_username = myuser
cygnus-ngsi.sinks.mysql-sink.mysql_password = mypassword
cygnus-ngsi.sinks.mysql-sink.mysql_maxPoolSize = 5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A parte de incluirlo en la lista de parametros para el sink habria que añadirlo a la tabla con su descripcion:
https://github.com/telefonicaid/fiware-cygnus/blob/master/doc/cygnus-ngsi/flume_extensions_catalogue/ngsi_mysql_sink.md#configuration

Asi como a todo lo relacionado con docker:
Configuración por defecto: https://github.com/telefonicaid/fiware-cygnus/blob/master/docker/cygnus-ngsi/agent.conf#L46
Entry point (lectura de ENV vars): https://github.com/telefonicaid/fiware-cygnus/blob/master/docker/cygnus-ngsi/cygnus-entrypoint.sh#L68

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -1,3 +1,4 @@
- [cygnus-common][MySQLBackend] Implement connection pooling (#1600)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that 1600 is a self-reference to the PR itself. Actually, I think you should include references to the issues this PR is fixing, I mean #1598 and #1597 (onel ine per issue).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done f014b0b

- [cygnus-common][MySQLBackend] Implement connection pooling (#1598, #1597)

private final HashMap<String, Connection> connections;

private final HashMap<String, DataSource> connections;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cache should be named "datasources" instead of "connections"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sink stop  method now releases MySQL resources.
Added some debug info to the logs.
private static final String DEFAULT_PORT = "3306";
private static final String DEFAULT_HOST = "localhost";
private static final String DEFAULT_USER_NAME = "root";
private static final int DEFAULT_MAX_POOL_SIZE = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Por defecto será 1? Tiene sentido hacer un pool para solo 1 elemento?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Molaría que fuera configurable en vez de una constante en código, sí...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fgalan las constantes son solamente los valores por defecto si no lo encuentra en el fichero de configuración del agente, es configurable.

@AlvaroVega Delegamos en el driver de Pooling toda la gestión de mantener la conexión activa y liberarla cuando se termina de usar. Hemos preferido utilizar el driver de Apache que será mas robusto que un desarrollo nuevo.

Durante las pruebas nos ha parecido que Flume no lanza consultas concurrentes a base de datos, creemos que la concurrencia se realiza a nivel de Sink, con un hilo por Sink, y al tener un pool por base de datos con un tamaño de 1 sería suficiente.

La idea a futuro es que ya que está la posibilidad de pooling, cuando se utilicen las características de balanceo de Sinks que tiene el agente Flume, creando varios sinks iguales por agente, hacer que los MySQLSinks balanceados compartan driver, y por tanto ahí si que tendría sentido configurar un pool de mayor tamaño.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Si es así entonces el pool no crecería de tamaño, quedando se en 1, a pesar de que pudiera tener un maximo mayor (con 5 ya me quedaba tranquilo), no?
En cualquier caso anotaria otro FIXME comentando el porque y la idea futura.

Si no es asi, y el valor de maxPoolSize debe ser siempre 1 como se propone en la documentación (don't change this parameter) entonces ya no es un parametro y no debe ser configurable, no?

Copy link
Collaborator Author

@pmo-sdr pmo-sdr Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1601 Estamos revisando ampiar la funcionalidad de balanceo de carga para explotar todas las posibilidades que nos ofrece el pooling, entre ellas que la posibilidad de utilizar pools de mayor tamaño.

En cualquier caso, configurar maxPoolSize>1 no tiene efectos negativos en el rendimiento, ni desperdicia recursos, ya que el pool no crece si no tiene dos solicitudes de conexión simultaneas:

time=2019-03-14T15:07:11.645Z | lvl=DEBUG | corr=9ca1af4c-466c-11e9-b51f-fa163e7fb904 | trans=7b2fdf84-d261-4b2e-beba-fd30df6d0406 | srv=dbName| subsrv=/subservice| comp=cygnusagent | op=getConnection | msg=com.telefonica.iot.cygnus.backends.mysql.MySQLBackendImpl$MySQLDriver[473] : Pool status (dbName) Max.: 5; Active: 1; Idle: 0

Se puede ver que a pesar de tener un máximo de 5 y estar utilizando una, no hay ninguna conexión inactiva.

import javax.sql.rowset.CachedRowSet;

import org.apache.commons.dbcp.ConnectionFactory;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

¿Habria que añadir la nueva dependencia al pom.xml?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No es necesario, es la misma librería que utilizan Flume y Hive
flume-ng-node -> flume-jdbc-channel -> commons-dbcp
hive-jdbc -> hive-metastore -> commons-dbcp

.gitignore Outdated
cygnus-ngsi/.project
cygnus-twitter/.classpath
cygnus-twitter/.project
telefonica_checkstyle.xml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

telefonica_checkstyle.xml shouldn't be ignored, it's an artifact that we want to keep in repository

Can you elaborate on the rest of addition to this file, please? .gitignore should be keep to the minimum list and each new addition should be carefully analyzed.

Copy link
Collaborator Author

@pmo-sdr pmo-sdr Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, .gitignore wasn't supposed to be part of the commit. Our local telefonica_checkstyle.xml has changes to get it working with our development IDE. Was added to local gitignore to avoid commiting this unwanted changes.

This changes have been reverted in 7d08b4c

Copy link
Member

@fgalan fgalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with regards to the couple of comments I provided. Thanks for the work!

If @AlvaroVega comments has been also addressed I think the PR is ready to be merged. Passing the ball to him for additional LGTM and final merge.

@@ -210,6 +210,7 @@ If `attr_persistence=colum` then `NGSIMySQLSink` will persist the data within th
| mysql\_port | no | 3306 ||
| mysql\_username | no | root | `root` is the default username that is created automatically |
| mysql\_password | no | N/A | Empty value as default (no password is created automatically) |
| mysql\_maxPoolSize | no | 1 | Max number of connections per database pool (don't change this parameter) |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yo aquí elminaría la coletilla de "(don't change this parameter)". Ya hemos dicho que no pasa nada por tener un valor mayor. Y además me parece que ahora por defecto el valor es 2. Haria mencion además a las futuras capacidades de loadbalancing que se supone que van a hacer uso de esta funcionalidad

@AlvaroVega AlvaroVega merged commit 872c93e into master Mar 22, 2019
@fgalan fgalan deleted the feature/1598_MySQL-Implement-Connection-Pooling branch March 22, 2019 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants