siddhi how to confirm the data availability? #1450

Closed
xywan89 opened this issue Aug 16, 2019 · 11 comments

Comments

@xywan89

xywan89 commented Aug 16, 2019

I have deployed a Siddhi cluster in Docker containers and will restart the cluster frequently (at least once a day). My question: when I restart a container, will data be lost or not?

If data will be lost, what can I do to ensure data availability?

@BuddhiWathsala
Contributor

Yes, if you deploy Siddhi in the default way, you will lose the data. You can enable state persistence in two ways.

  1. File system persistence
  2. DB persistence

To persist data in a file system, do the following.

  1. Create a directory (<PATH_TO_TEMP>/temp) to persist the state.
  2. Then create a YAML file with the following content to enable state persistence in file system mode. Let's say that file is config.yaml.
    state.persistence:
      enabled: true
      intervalInMin: 1
      revisionsToKeep: 2
      persistenceStore: io.siddhi.distribution.core.persistence.FileSystemPersistenceStore
      config:
        location: siddhi-app-persistence
  3. Then run the Docker container using the following command. The command mounts config.yaml at /conf/config.yaml inside the container, and the Siddhi runner uses that file to override its default configuration. This config change enables periodic state persistence.
    docker run \
      -v <PATH_TO_TEMP>/temp:/home/siddhi_user/siddhi-runner/wso2/runner/siddhi-app-persistence \
      -v <PATH_TO_CONFIG_YAML>/config.yaml:/conf/config.yaml \
      -v <PATH_TO_SIDDHI_APPS>/PowerConsumptionSurgeDetection.siddhi:/siddhi/PowerConsumptionSurgeDetection.siddhi \
      -p 8070:8070 \
      siddhiio/siddhi-runner-ubuntu:5.1.0-m2 \
      -Dconfig=/conf/config.yaml -Dapps=/siddhi/PowerConsumptionSurgeDetection.siddhi

The file-system configuration above persists the state to your <PATH_TO_TEMP>/temp directory. To enable database persistence, use the following YAML block instead. You then have to connect a DB to the Siddhi runner container.

state.persistence:
  enabled: true
  intervalInMin: 1
  revisionsToKeep: 3
  persistenceStore: io.siddhi.distribution.core.persistence.DBPersistenceStore
  config:
    datasource: <DATASOURCE NAME>   # A datasource with this name should be defined in wso2.datasources namespace
    table: <TABLE NAME>

Please refer to the Siddhi documentation for more details.
[1] https://siddhi.io/en/v5.0/docs/siddhi-as-a-docker-microservice/#running-with-runner-config
[2] https://siddhi.io/en/v5.0/docs/config-guide/#configuring-periodic-state-persistence
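
To make the `intervalInMin` and `revisionsToKeep` settings above a little more concrete, here is a minimal Python sketch of what a file-based store with revision retention might do: write a full snapshot per checkpoint, prune all but the newest N revisions, and reload the newest one after a restart. It is an illustration only, not Siddhi's implementation. The function names `persist` and `restore_last_revision`, the JSON format, and the file naming are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def persist(store_dir: Path, app_name: str, state: dict, revision: int,
            revisions_to_keep: int = 2) -> Path:
    """Write one full snapshot and prune older ones (hypothetical layout)."""
    if revisions_to_keep < 1:
        raise ValueError("revisions_to_keep must be at least 1")
    app_dir = store_dir / app_name
    app_dir.mkdir(parents=True, exist_ok=True)
    # Zero-padded revision numbers sort correctly as plain strings.
    snapshot = app_dir / f"{revision:012d}.json"
    snapshot.write_text(json.dumps(state))
    revisions = sorted(app_dir.glob("*.json"))
    # Keep only the newest `revisions_to_keep` snapshots.
    for old in revisions[:-revisions_to_keep]:
        old.unlink()
    return snapshot


def restore_last_revision(store_dir: Path, app_name: str):
    """Return the newest persisted state, or None if nothing was persisted."""
    app_dir = store_dir / app_name
    if not app_dir.is_dir():
        return None
    revisions = sorted(app_dir.glob("*.json"))
    if not revisions:
        return None
    return json.loads(revisions[-1].read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        # Three checkpoints; only the last two remain on disk.
        for rev, total in enumerate([10, 25, 40], start=1):
            persist(store, "PowerConsumptionSurgeDetection", {"total": total}, rev)
        # Simulated restart: reload the newest revision.
        print(restore_last_revision(store, "PowerConsumptionSurgeDetection"))
```

In the Docker setup, the mounted host directory plays the role of `store_dir`: because it lives outside the container, the snapshots survive a container restart. Anything processed after the last checkpoint would still be lost in this toy model.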

@cristicmf

If I don't want to use Docker or Kubernetes, how can I set up a multi-datacenter high-availability deployment?

@BuddhiWathsala
Contributor

@cristicmf, currently the Siddhi distribution does not support that HA deployment without Docker or K8s.

But if you really need this HA feature, you can try out the HA functionality in our stream processor. Please refer to this link to learn more about HA deployment in the stream processor.

@cristicmf

cristicmf commented Aug 21, 2019

> @cristicmf, currently the Siddhi distribution does not support that HA deployment without Docker or K8s.
>
> But if you really need this HA feature, you can try out the HA functionality in our stream processor. Please refer to this link to learn more about HA deployment in the stream processor.

Thanks. I would also like to know more about scalability. Can you give me some tips?

@BuddhiWathsala
Contributor

Our stream processor supports the following deployment types.

  1. HA deployment as I mentioned above.
  2. Fully distributed deployment.
  3. Multi datacenter HA deployment.

We now have a separate runtime called Siddhi runner, which is a very lightweight environment for running streaming logic. All the deployment types are now managed using Docker and K8s.

You can find more details about Docker deployments in Siddhi at the following links.
[1] https://github.com/siddhi-io/docker-siddhi
[2] https://hub.docker.com/u/siddhiio

For K8s deployment, we have a custom K8s operator called Siddhi operator. So far, Siddhi operator supports the default distributed deployment. We are now working on the fully distributed deployment for K8s. You can find K8s deployment details at the following link.

[3] https://github.com/siddhi-io/siddhi-operator

Try these Katacoda samples for each deployment type in K8s.

Also, refer to the Siddhi documentation for more descriptive details.

@xywan89
Author

xywan89 commented Aug 23, 2019

> refer this link

I run Siddhi as a Java library, and it seems that none of the approaches mentioned above cover that case. Is there any way to ensure data availability?

HA mode requires a duplicate instance, so it is not a good way to handle my restart scenario.

@BuddhiWathsala
Contributor

@xywan89 you can achieve in-memory persistence and file store persistence using the Siddhi Java library. Please refer to the following test cases to understand how to persist your state.

  1. In-memory
  2. File system

@xywan89
Author

xywan89 commented Aug 23, 2019

> 2. File system

Thanks. It seems the Java documentation is insufficient; I look forward to your additions.

What else is there to know about persistence, such as the difference between persistence and IncrementalPersistence?

@xywan89
Author

xywan89 commented Aug 27, 2019

> @xywan89 you can achieve in-memory persistence and file store persistence using the Siddhi Java library. Please refer to the following test cases to understand how to persist your state.
>
>   1. In-memory
>   2. File system

Another question: can I limit the resources used by the Siddhi runtime when I use Siddhi as a Java library, for example by limiting memory usage?

@BuddhiWathsala
Contributor

> 2. File system
>
> Thanks. It seems the Java documentation is insufficient; I look forward to your additions.
>
> What else is there to know about persistence, such as the difference between persistence and IncrementalPersistence?

Sorry for the late reply. Regular state persistence simply persists the overall state of the application at the current checkpoint. For example, if your application is at checkpoint X1, Siddhi persists X1 to the file system. When it moves to checkpoint X2, Siddhi again persists the overall X2 state to the file system. However, this is a redundant persistence mechanism when your application holds gigabytes of data and only small changes occur between checkpoints.

For that kind of scenario, you can use incremental persistence, which uses incremental checkpointing: Siddhi persists only the changes (the delta) instead of the overall state.
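
The difference between the two modes can be sketched with a toy model in Python. This is only an illustration of full versus incremental checkpointing in general; the class names, the dictionary-based state, and the "entries written" cost measure are assumptions made for the example and say nothing about Siddhi's actual storage format.

```python
import copy

_DELETED = object()  # marker for keys removed since the previous checkpoint


class FullCheckpointStore:
    """Stores a complete copy of the state at every checkpoint."""

    def __init__(self):
        self.snapshots = []

    def persist(self, state: dict) -> int:
        self.snapshots.append(copy.deepcopy(state))
        return len(state)  # number of entries written

    def restore(self) -> dict:
        return copy.deepcopy(self.snapshots[-1]) if self.snapshots else {}


class IncrementalCheckpointStore:
    """Stores one base snapshot, then only the changes (deltas)."""

    def __init__(self):
        self.base = None
        self.deltas = []
        self._last = {}  # state as of the previous checkpoint

    def persist(self, state: dict) -> int:
        if self.base is None:
            self.base = copy.deepcopy(state)
            written = len(state)
        else:
            # Record added or changed keys.
            delta = {k: copy.deepcopy(v) for k, v in state.items()
                     if k not in self._last or self._last[k] != v}
            # Record removed keys.
            for k in self._last:
                if k not in state:
                    delta[k] = _DELETED
            self.deltas.append(delta)
            written = len(delta)
        self._last = copy.deepcopy(state)
        return written

    def restore(self) -> dict:
        # Replay every delta on top of the base snapshot, oldest first.
        state = copy.deepcopy(self.base) if self.base is not None else {}
        for delta in self.deltas:
            for k, v in delta.items():
                if v is _DELETED:
                    state.pop(k, None)
                else:
                    state[k] = copy.deepcopy(v)
        return state


if __name__ == "__main__":
    state = {f"sensor-{i}": 0 for i in range(1000)}
    full, inc = FullCheckpointStore(), IncrementalCheckpointStore()
    full.persist(state)
    inc.persist(state)          # checkpoint X1: both write everything
    state["sensor-7"] = 42      # one small change before checkpoint X2
    print("full wrote", full.persist(state), "entries")
    print("incremental wrote", inc.persist(state), "entries")
    assert full.restore() == inc.restore() == state
```

The trade-off in this toy model is that the incremental store writes far less per checkpoint but has to replay the whole chain of deltas on restore, while the full store writes everything each time and restores from a single snapshot.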

@mohanvive
Contributor

Closing the issue since the query is answered. Please reopen if you need further assistance.
