[FLINK-7301] [docs] Rework state documentation #4441

Closed
wants to merge 2 commits into from

twalthr (Contributor) commented Jul 31, 2017

What is the purpose of the change

This PR restructures the state-related documentation pages. It introduces a state introduction page and moves some files (from setup/ to ops/) according to the new documentation structure.

Brief change log

Documentation changes only.

Verifying this change

Built with the build script and checked the links.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

twalthr (Contributor Author) commented Jul 31, 2017

CC @alpinegizmo

alpinegizmo (Contributor) left a comment

Very nice. But we should add redirects for all the pages being moved, which I believe is the list below. I know I have linked to a bunch of these pages from answers on stackoverflow and from the training site, and I imagine there are links elsewhere on the web.

dev/stream/checkpointing.html
dev/stream/state.html
dev/stream/queryable_state.html
monitoring/large_state_tuning.html
ops/state_backends.html
setup/aws.html
setup/building.html
setup/checkpoints.html
setup/cli.html
setup/cluster_setup.html
setup/config.html
setup/deployment.html
setup/docker.html
setup/flink_on_windows.html
setup/gce_setup.html
setup/index.html
setup/jobmanager_high_availability.html
setup/kubernetes.html
setup/mapr_setup.html
setup/mesos.html
setup/savepoints.html
setup/security-ssl.html
setup/yarn_setup.html

under the License.
-->

If your application uses Flink's managed state, it might be necessary to implement a custom serialization logic for special use cases.
Contributor

drop the word "a" in "implement a custom serialization logic" so that it reads "implement custom serialization logic"

Note that Flink writes state serializers along with the state as metadata. In certain cases on restore (see following
subsections), the written serializer needs to be deserialized and used. Therefore, it is recommended to avoid using
anonymous classes as your state serializers. Anonymous classes do not have a guarantee on the generated classname,
varying across compilers and depends on the order that they are instantiated within the enclosing class, which can
Contributor

"varying across compilers and depends" ==> "which varies across compilers and depends"
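
To illustrate the point in the quoted passage, here is a minimal Java sketch of why anonymous classes make fragile state serializers. `DemoSerializer` and `NamedSerializer` are hypothetical stand-ins invented for this example, not Flink's serializer API. The sketch only prints class names: the named class gets a name fixed by the source, while the anonymous ones get compiler-generated names that typically depend on declaration order.

```java
import java.io.Serializable;

public class AnonymousNameDemo {

    // Hypothetical stand-in for a state serializer interface (not Flink's API).
    public interface DemoSerializer extends Serializable {
        String describe();
    }

    // A named nested class: its binary name is determined by the source code.
    public static final class NamedSerializer implements DemoSerializer {
        private static final long serialVersionUID = 1L;

        @Override
        public String describe() {
            return "named";
        }
    }

    // Two anonymous classes: their names are generated by the compiler and
    // can change if the declarations are reordered or another one is added.
    public static final DemoSerializer first = new DemoSerializer() {
        @Override
        public String describe() {
            return "first anonymous";
        }
    };

    public static final DemoSerializer second = new DemoSerializer() {
        @Override
        public String describe() {
            return "second anonymous";
        }
    };

    public static void main(String[] args) {
        // Stable name, safe to look up again when restoring written metadata.
        System.out.println(NamedSerializer.class.getName());
        // Generated names; the exact values are up to the compiler.
        System.out.println(first.getClass().getName());
        System.out.println(second.getClass().getName());
    }
}
```

Swapping the two anonymous declarations and recompiling would usually swap their generated names too, which is the kind of change that can break a lookup by class name on restore.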

- When training a machine learning model over a stream of data points, the state holds the current version of the model parameters.
- When historic data needs to be managed, the state allows efficient access to events that occurred in the past.

Flink needs to be aware of the state in order to make state fault tolerant using [checkpoints](checkpointing.html) and allow [savepoints]({{ site.baseurl }}/ops/state/savepoints.html) of streaming applications.
Contributor

"and to allow [savepoints]"

twalthr (Contributor Author) commented Aug 3, 2017

@alpinegizmo I thought about adding redirects, but we would end up in redirect hell if we added one for every page that moves in the future. Actually, only links to the master docs change, and we should not link to the master docs in trainings/stackoverflow anyway. Proper links to released docs remain unchanged.

alpinegizmo (Contributor) commented

@twalthr Duh, of course, you're right.

+1

twalthr (Contributor Author) commented Aug 8, 2017

Thanks @alpinegizmo. I will merge this now.
