Traffic Ops Golang parent.config #3075

Merged
merged 26 commits into from May 31, 2019

Conversation

rob05c
Member

@rob05c rob05c commented Dec 3, 2018

What does this PR do?

Traffic Ops Golang parent.config

Still WIP. I've tested against a single edge and mid; still need to test against a large number of configurations of edges and mids.

Removing WIP - I think this is good to be merged. I've manually diffed versus the old Perl parent.config, against every edge and mid in our production CDN.

The only differences from the Perl output occur where duplicate origins exist with different configurations (a data bug). Perl arbitrarily selects one with no warning or error; Go now logs an error when that happens. Unfortunately, Perl orders them by an internal Perl hash of the object, which isn't feasible to replicate in Go. But again, it's a data bug to have that scenario at all.
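
A minimal sketch of the duplicate-origin handling described above; it is not the code in this PR. The `originConfig` type, its fields, and `dedupeOrigins` are hypothetical names, and keeping the first entry seen is an assumption made here only to get a deterministic result. The point is that a conflicting duplicate produces an error log instead of a silent, arbitrary choice.

```go
package main

import (
	"fmt"
	"log"
)

// originConfig is a hypothetical stand-in for the per-origin data
// that feeds parent.config generation.
type originConfig struct {
	Origin  string // origin FQDN
	Parents string // illustrative setting that may differ between duplicates
}

// dedupeOrigins keeps the first config seen for each origin, in input order.
// It returns the deduplicated list and the number of conflicting duplicates,
// logging an error for each conflict. Keeping the first entry is an assumption.
func dedupeOrigins(configs []originConfig) ([]originConfig, int) {
	seen := map[string]originConfig{}
	out := []originConfig{}
	conflicts := 0
	for _, c := range configs {
		prev, ok := seen[c.Origin]
		if !ok {
			seen[c.Origin] = c
			out = append(out, c)
			continue
		}
		if prev != c {
			// Same origin, different configuration: a data bug worth surfacing.
			conflicts++
			log.Printf("ERROR: duplicate origin '%s' with different configuration; using the first one seen", c.Origin)
		}
	}
	return out, conflicts
}

func main() {
	out, conflicts := dedupeOrigins([]originConfig{
		{Origin: "origin.example.net", Parents: "mid-a"},
		{Origin: "origin.example.net", Parents: "mid-b"},
		{Origin: "other.example.net", Parents: "mid-a"},
	})
	fmt.Println(len(out), conflicts)
}
```

Iterating over the input slice rather than over a map keeps the output order stable between runs, which matters when diffing generated configs.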

Fixes #3071 in 48ba440
Does not fix #2725 - that's a much more invasive fix, which I think should be a separate PR.

Includes API tests.

Which TC components are affected by this PR?

  • Documentation
  • Grove
  • Traffic Analytics
  • Traffic Monitor
  • Traffic Ops
  • Traffic Ops ORT
  • Traffic Portal
  • Traffic Router
  • Traffic Stats
  • Traffic Vault
  • Other _________

What is the best way to verify this PR?

Check all that apply

  • This PR includes tests
  • This PR includes documentation updates
  • This PR includes an update to CHANGELOG.md
  • This PR includes all required license headers
  • This PR includes a database migration (ensure that migration sequence is correct)
  • This PR fixes a serious security flaw. Read more: www.apache.org/security

@rob05c rob05c added new feature A new feature, capability or behavior Traffic Ops related to Traffic Ops WIP "Work-in-Progress" - do not merge! (use 'draft' pull requests from now on) labels Dec 3, 2018
@asfgit
Contributor

asfgit commented Dec 3, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2861/
Test PASSed.

@asfgit
Contributor

asfgit commented Dec 3, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2862/
Test FAILed.

@asfgit
Contributor

asfgit commented Dec 3, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2868/
Test FAILed.

@asfgit
Contributor

asfgit commented Dec 3, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2874/
Test PASSed.

@rob05c rob05c added the cache-config Cache config generation label Dec 5, 2018
@rob05c rob05c removed the WIP "Work-in-Progress" - do not merge! (use 'draft' pull requests from now on) label Dec 5, 2018
@rob05c rob05c changed the title WIP Traffic Ops Golang parent.config Traffic Ops Golang parent.config Dec 5, 2018
@asfgit
Contributor

asfgit commented Dec 5, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2891/
Test FAILed.

@asfgit
Contributor

asfgit commented Dec 5, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2892/
Test FAILed.

Member

@ezelkow1 ezelkow1 left a comment

Looks good other than those 3 err check spots; I'm not sure whether they need to be moved higher or not.

traffic_ops/client/atsconfig.go Outdated Show resolved Hide resolved
traffic_ops/client/atsconfig.go Show resolved Hide resolved
traffic_ops/client/atsconfig.go Show resolved Hide resolved
@asfgit
Contributor

asfgit commented Dec 10, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2926/
Test FAILed.

@asfgit
Contributor

asfgit commented Dec 10, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2927/
Test FAILed.

@asfgit
Contributor

asfgit commented Jan 25, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3104/
Test FAILed.

@rob05c rob05c force-pushed the to-parent-dot-config branch 2 times, most recently from 49b3b4c to 0516d71 Compare January 25, 2019 19:25
@asfgit
Contributor

asfgit commented Jan 25, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3106/
Test FAILed.

@rob05c
Member Author

rob05c commented Feb 19, 2019

retest this please

@asfgit
Contributor

asfgit commented Feb 19, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3241/
Test FAILed.

@rob05c
Member Author

rob05c commented Apr 26, 2019

@ocket8888 Odd, this PR shouldn't affect those files. I just rebased to the latest master, try now?

@asfgit
Contributor

asfgit commented Apr 26, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3591/
Test PASSed.

@ocket8888
Contributor

Builds now. The problem was new files introduced to the repo that were untracked in your branch; they caused compilation errors because of Docker's recursive copy and Go's implicit build system.

Contributor

@ocket8888 ocket8888 left a comment

I just have a couple of nitpicks and some questions, but after testing against both the default CDN-in-a-Box environment and a dump of production data this seems to work exactly as advertised.

Also, fwiw I think it's perfectly fine to log warnings even where Perl didn't - it doesn't change the logic and everywhere you put the //TODO it seemed like a good idea to me. Not gonna block on that, but if you wanted to add them in it'd be an improvement imo.

traffic_ops/traffic_ops_golang/ats/parentdotconfig.go Outdated Show resolved Hide resolved
traffic_ops/traffic_ops_golang/ats/parentdotconfig.go Outdated Show resolved Hide resolved
}
textLine += ` max_simple_retries=` + ds.MSOMaxSimpleRetries + ` max_unavailable_server_retries=` + ds.MSOMaxUnavailableServerRetries
}
textLine += "\n"
Contributor

Would it make more sense to join the array with "\n" later, rather than appending "\n" every time and joining with an empty string?

Member Author

Maybe. At this point, I'd really prefer not to change it. These variables are reused so much, as an artifact of Perl, that I'd be afraid of accidentally changing the logic and breaking something. I added a TODO so it'll be looked at with the rest of #3515, if that's ok?

Contributor

It's not a bug, so I'm fine with it as long as it's captured somehow.
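
For reference, a small sketch of the two approaches discussed in this thread. The function names and sample lines are illustrative only and are not taken from parentdotconfig.go. Both produce the same text as long as every line needs a trailing newline; joining with "\n" alone would drop the final newline, so the sketch adds it back.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPerLine mirrors the current pattern: each line gets its own "\n",
// and the lines are then joined with an empty separator.
func joinPerLine(lines []string) string {
	withNewlines := make([]string, 0, len(lines))
	for _, l := range lines {
		withNewlines = append(withNewlines, l+"\n")
	}
	return strings.Join(withNewlines, "")
}

// joinOnce is the suggested alternative: join with "\n" a single time,
// then append the trailing newline that the join itself does not add.
func joinOnce(lines []string) string {
	if len(lines) == 0 {
		return ""
	}
	return strings.Join(lines, "\n") + "\n"
}

func main() {
	// Sample lines are made up for illustration.
	lines := []string{
		`dest_domain=origin.example.net parent="mid1.example.net:80"`,
		`dest_domain=. go_direct=true`,
	}
	fmt.Println(joinPerLine(lines) == joinOnce(lines))
}
```

The equivalence only holds if no code between the append and the join depends on the per-line "\n", which is the risk raised above about reused variables.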

traffic_ops/traffic_ops_golang/ats/parentdotconfig.go Outdated Show resolved Hide resolved
@asfgit
Contributor

asfgit commented Apr 29, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3597/
Test PASSed.

@rawlinp
Contributor

rawlinp commented Apr 29, 2019

Also, fwiw I think it's perfectly fine to log warnings even where Perl didn't - it doesn't change the logic

We need to be mindful of the amount of logging we add and decide whether or not it really needs to be added at all, and if it truly is needed, it should be logged at the right level. If we add unnecessary warnings to code that gets called thousands of times during a Queue Updates operation, then we now have thousands of warnings that could otherwise be drowning out actual problems that need addressed.

Basically, if a warning is just "normal operation" of the system, then it should not be logged as a warning. A warning generally means that the system is operating sub-optimally, and that something needs to be addressed before the warnings start leading to errors. If someone is digging into the logs, they should be able to quickly identify issues without having to sift through a bunch of meaningless information. A good rule of thumb is to imagine you're the ops engineer who's been tasked with triaging an issue in Production and has to dig into the logs without any knowledge of the code internals. You'll want to do yourself a favor at that point.

Just my 2 cents :)

@rob05c
Member Author

rob05c commented Apr 29, 2019

if a warning is just "normal operation" of the system, then it should not be logged as a warning. A warning generally means that the system is operating sub-optimally, and that something needs to be addressed before the warnings start leading to errors

I agree. As far as I'm aware, all the warnings I added are things that should be addressed. But I could be mistaken; in this code in particular, it's pretty easy to mistake normal operation for an issue. I'm open to changing any warning that turns out to be acceptable normal operation.

@asfgit
Contributor

asfgit commented Apr 29, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3598/
Test FAILed.

@asfgit
Contributor

asfgit commented Apr 29, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3599/
Test PASSed.

@asfgit
Contributor

asfgit commented Apr 29, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3600/
Test FAILed.

@rawlinp
Contributor

rawlinp commented Apr 29, 2019

I think the current warnings in this PR are fine. I don't really know which warnings @ocket8888 was referring to, but from what I can remember about my initial review a long time ago, most of my comments about removing logging were for logs that seemed like informational debug logs that would only be useful for developing/testing this implementation. ¯\_(ツ)_/¯

@ocket8888
Contributor

I was referring to logs that aren't there. There are two or three places where missing or malformed data just causes the current function to bail or skip the current item, and he has e.g. // TODO log a warning? Perl doesn't
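
A sketch of the pattern in question, using hypothetical names (`server`, `parentLines`); it is not code from this PR. An item with missing data is skipped, and the open question is whether that skip should also be logged. Here the skips are counted and reported once per call, which is just one assumed way to keep log volume down if the warning is added.

```go
package main

import (
	"fmt"
	"log"
)

// server is a hypothetical record; Host and Port stand in for required data.
type server struct {
	Host string
	Port int
}

// parentLines builds one entry per valid server and skips malformed ones.
// It returns the entries and how many servers were skipped.
func parentLines(servers []server) ([]string, int) {
	lines := []string{}
	skipped := 0
	for _, s := range servers {
		if s.Host == "" || s.Port <= 0 {
			skipped++ // this is the "TODO log a warning?" spot
			continue
		}
		lines = append(lines, fmt.Sprintf("%s:%d", s.Host, s.Port))
	}
	if skipped > 0 {
		// One summary line per call instead of one line per skipped item.
		log.Printf("WARNING: skipped %d servers with missing host or port", skipped)
	}
	return lines, skipped
}

func main() {
	lines, skipped := parentLines([]server{
		{Host: "mid1.example.net", Port: 80},
		{Host: "", Port: 80},
	})
	fmt.Println(lines, skipped)
}
```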

@rob05c
Member Author

rob05c commented Apr 29, 2019

Yeah, those particular comments, I'm not sure whether it's normal operation, or an issue

@rob05c
Member Author

rob05c commented Apr 29, 2019

retest this please

@asfgit
Contributor

asfgit commented Apr 29, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3604/
Test PASSed.

@rawlinp rawlinp dismissed their stale review May 14, 2019 17:21

I believe all my comments have been addressed so I'm dismissing my review for now

Labels
cache-config Cache config generation new feature A new feature, capability or behavior Traffic Ops related to Traffic Ops
Development

Successfully merging this pull request may close these issues.

  • Explicit MSO Parent Rank wrong
  • Traffic Ops generates bad default parent.config
5 participants