Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-1442] Reduce memory consumption of archived execution graph #344

Closed
wants to merge 1 commit into from

Conversation

mxm
Copy link
Contributor

@mxm mxm commented Jan 26, 2015

No description provided.

@StephanEwen
Copy link
Contributor

Looks good so far. I see that you removed the LRU code. Was that on purpose?

Leaving it in may be a good idea, because the soft references are cleared in arbitrary order. It may make newer jobs disappear before older ones. Having the LRU in would mean things behave as previously as long as the memory is sufficient, and the soft reference clearing kicks in as a safety valve.

@mxm
Copy link
Contributor Author

mxm commented Jan 27, 2015

@StephanEwen Yes, that was on purpose. The previous two data structures (HashMap and Queue) are now replaced by the LinkedHashMap which serves the same functionality. It might not be obvious but the LinkedHashMap preserves the order of the inserted items. From scala.collection.mutable.LinkedHashMap:

This class implements mutable maps using a hashtable.
The iterator and all traversal methods of this class visit elements in the order they were inserted.

That's why graphs.iterator.next() always returns the least recently inserted item.

* @param jobID
* @return ExecutionGraph or null
*/
def getGraph(jobID: JobID) = graphs.get(jobID) match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing that? Why not working on the Option type? I don't like null.

@StephanEwen
Copy link
Contributor

I have actually merged this branch locally and already addressed most of the issues you found

val iter = graph.getAllExecutionVertices().iterator()
while (iter.hasNext) {
iter.next().clearExecutionEdges()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using Scala power: graph.getAllExecutionVertices.asScala.foreach { _.clearExecutionEdges } with import scala.collection.JavaConvertes.iterableAsScalaIterableConverter in scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much nicer! Thanks.

@StephanEwen
Copy link
Contributor

You can also shoot yourself in the foot with Option, no? There is no substitute for thorough programming...

private def cleanup(jobID: JobID): Unit = {
while (graphs.size > max_entries) {
// get first graph inserted
val (jobID, value) = graphs.iterator.next()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

graphs.head equivalent to graphs.iterator.next() and shorter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@tillrohrmann
Copy link
Contributor

It is true that thorough programming is always the best, but Option encourages you to apply a more functional programming style which in many cases results in safer code. For example, Scala warns you if the pattern matching on Option is not exhaustive, whereas this is not the case for arbitrary classes and null values.

@tillrohrmann
Copy link
Contributor

Ah ok perfect that you already addressed the minor issues.

StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Feb 4, 2015
StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Feb 5, 2015
StephanEwen pushed a commit to StephanEwen/flink that referenced this pull request Feb 5, 2015
@asfgit asfgit closed this in 9d181a8 Feb 5, 2015
fhueske pushed a commit to fhueske/flink that referenced this pull request Feb 5, 2015
@StephanEwen
Copy link
Contributor

I addressed the issues already in a cleanup commit built on top of #317 . See 56b7f85

@hsaputra
Copy link
Contributor

hsaputra commented Feb 5, 2015

Awesome! Thanks for the update.

marthavk pushed a commit to marthavk/flink that referenced this pull request Jun 9, 2015
zhijiangW pushed a commit to zhijiangW/flink that referenced this pull request Jul 7, 2020
Co-authored-by: morsapaes <marta.paes.moreira@gmail.com>

This closes apache#344.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants