Refactor duplication in UpdateLog.init#2491
Conversation
with HdfsUpdateLog.init
|
|
||
| usableForChildDocs = core.getLatestSchema().isUsableForChildDocs(); | ||
|
|
||
| try { |
There was a problem hiding this comment.
this is just moved code, but the diff can't show that
| log.info("Initializing HdfsUpdateLog: tlogDfsReplication={}", tlogDfsReplication); | ||
| } | ||
|
|
||
| private Configuration getConf(Path path) { |
There was a problem hiding this comment.
simply re-ordered this below when it's called
| ll.closeOutput(); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
this was all that duplicated code, the point of this refactoring/PR
|
|
|
|
bruno-roustant
left a comment
There was a problem hiding this comment.
LGTM.
It was hard to follow the moves, but the resulting code is clearer.
| // moving the tlog dir on reload | ||
| if (fs == null) { | ||
| // ulogDir from CoreDescriptor overrides | ||
| String ulogDir = core.getCoreDescriptor().getUlogDir(); |
There was a problem hiding this comment.
Should we use this.dataDir instead? (set to core.getUlogDir() in super.init())
There was a problem hiding this comment.
(note I didn't write this code; I just moved these 2 lines)
I think the logic around dataDir / ulogDir resolution is pretty confusing; I'm not comfortable improving this TBH without a bigger investment it than my goal right now. Note init() can be called additional times for core reloads so the dataDir will hold whatever it had from before. It appears that's pertinent.
@magibney touched similar code recently; maybe would offer input
There was a problem hiding this comment.
There's some weird overlap between this PR and #1895 -- it's taking me a while to make sense of the relationships here, so thanks for bearing with me 😬 .
For all the work I've put into this area of the code recently, it's still quite confusing, and I'm hoping that this PR will help clarify things. I hesitate to suggest, but I'm wondering if it might be better to incorporate this refactor into #1895, since I'm not sure it makes sense to do a rote refactor of code that is iirc basically buggy. I'm taking a stab at what that would look like now, mainly to better understand the code.
I'll follow up soon.
There was a problem hiding this comment.
A rote refactor is safe no matter if the code was buggy or not; no? Any way, if you're doing different things here such that the two methods I pulled out are not needed (yes / no?) then I should not do this change at this time. I saw code duplication and aimed to address it. It's at the end of the init method; same code, albeit not char-for-char. This PR isn't super important to me; no rush.
There was a problem hiding this comment.
A rote refactor is safe no matter if the code was buggy or not; no?
💯 yeah, I wasn't suggesting it wasn't safe, and I think this PR was an improvement.
962b99f applies the spirit of this PR to #1895, and I'm pretty happy with the results (it forced me to confront some of the duplication that had been a lingering source of confusion).
with HdfsUpdateLog.init
Should just be a wrote refactor here. UpdateLog.init does a bunch of stuff, and HdfsUpdateLog.init does even more stuff yet some same stuff too (without calling super.init). So I extracted out a couple aspects so that Hdfs can extend those parts to provide its own logic.