This repository was archived by the owner on Nov 24, 2025. It is now read-only.

Fix Traffic Ops Tenancy and Activity Bugs, Fix TO API Test Framework to work with Tenancy #3163

Merged
dangogh merged 2 commits into apache:master from rob05c:to-fix-tenancy-inactivity
Jan 16, 2019
Conversation

@rob05c
Member

@rob05c rob05c commented Dec 25, 2018

This fixes major TO Tenancy bugs, fixes the TO API Test framework to work with Tenancy enabled, and enables Tenancy in the tests.

Unfortunately, the tenancy issues and the test framework are intimately related, and there isn't a good way to separate them, so this PR is large.

Specifically:

  • Fix TO Tenancy checks to consider whether the user's tenant is active, not whether the resource's tenant is active.

  • Fix TO Tenancy to prevent access when parent tenants (of the
    user's tenant) are inactive.

  • Add TO API Tests to verify tenancy, verify access is disabled when
    parents or self are inactive.

  • Fix TO API Test user panic on test failure.

  • Fix TO API Test tenants to automatically delete children before
    parents, rather than hard-coding.

  • Fix TO API Test parameters race condition by deleting all
    parameters with a given name+file, not only the first.

  • Fix TO API Tests to enable Tenancy on all tests (by creating
    Parameters, which creates the use_tenancy param).
    The TO API Tests were completely broken with tenancy, due
    to the above bugs. With the bugs fixed, the tests pass
    with tenancy enabled.
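
The first two fixes amount to walking up from the user's tenant and requiring every tenant on that path to be active. Below is a minimal sketch of that activity check only, assuming a hypothetical in-memory `Tenant` type and map; the actual Traffic Ops check runs against the database and may be structured differently.

```go
package main

import "fmt"

// Tenant is a hypothetical in-memory stand-in for a tenant row.
type Tenant struct {
	ID       int
	ParentID int // 0 means "no parent" (root); an assumption of this sketch
	Active   bool
}

// userTenantUsable reports whether the user's tenant and every ancestor of it
// are active. The resource's own tenant activity is deliberately not consulted.
func userTenantUsable(tenants map[int]Tenant, userTenantID int) bool {
	seen := map[int]bool{} // guards against cycles in bad data
	id := userTenantID
	for {
		t, ok := tenants[id]
		if !ok || !t.Active || seen[id] {
			return false
		}
		if t.ParentID == 0 {
			return true // reached the root with every tenant active
		}
		seen[id] = true
		id = t.ParentID
	}
}

func main() {
	tenants := map[int]Tenant{
		1: {ID: 1, ParentID: 0, Active: true},
		2: {ID: 2, ParentID: 1, Active: false},
		3: {ID: 3, ParentID: 2, Active: true},
	}
	fmt.Println(userTenantUsable(tenants, 1)) // root tenant is active
	fmt.Println(userTenantUsable(tenants, 3)) // parent tenant 2 is inactive
}
```

In this sketch a user in tenant 3 is denied even though tenant 3 itself is active, because its parent is not.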

Fixes #2732

What does this PR do?

Fixes #2732

Which TC components are affected by this PR?

  • Documentation
  • Grove
  • Traffic Analytics
  • Traffic Monitor
  • Traffic Ops
  • Traffic Ops ORT
  • Traffic Portal
  • Traffic Router
  • Traffic Stats
  • Traffic Vault
  • Other _________

What is the best way to verify this PR?

Run API tests. Run TO with Tenancy enabled, verify resources are accessible as expected.

Check all that apply

  • This PR includes tests
  • This PR includes documentation updates
  • This PR includes an update to CHANGELOG.md
  • This PR includes all required license headers
  • This PR includes a database migration (ensure that migration sequence is correct)
  • This PR fixes a serious security flaw. Read more: www.apache.org/security

@rob05c rob05c added the labels bug (something isn't working as intended), new feature (A new feature, capability or behavior), Traffic Ops (related to Traffic Ops), and high impact (impacts the basic function, deployment, or operation of a CDN) on Dec 25, 2018
Member Author

This fixes a race, where tests would randomly fail when DSSes weren't in the 20 returned by default.

Member Author

This fixes a race, where certain tests would create additional parameters with the same name+file (which is permissible, as long as the values are different). This fixes it to delete all returned parameters.
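
A rough sketch of the idea, assuming a hypothetical trimmed-down `Parameter` type rather than the real client types: collect every parameter that matches the name and config file, so that cleanup deletes all of them.

```go
package main

import "fmt"

// Parameter is a hypothetical, trimmed-down parameter record for this sketch.
type Parameter struct {
	ID         int
	Name       string
	ConfigFile string
	Value      string
}

// matchingIDs returns the ID of every parameter with the given name and config
// file, so a cleanup step can delete all of them rather than only the first.
func matchingIDs(params []Parameter, name, configFile string) []int {
	var ids []int
	for _, p := range params {
		if p.Name == name && p.ConfigFile == configFile {
			ids = append(ids, p.ID)
		}
	}
	return ids
}

func main() {
	params := []Parameter{
		{ID: 1, Name: "example.name", ConfigFile: "example.config", Value: "a"},
		{ID: 2, Name: "example.name", ConfigFile: "example.config", Value: "b"},
		{ID: 3, Name: "other.name", ConfigFile: "example.config", Value: "a"},
	}
	fmt.Println(matchingIDs(params, "example.name", "example.config"))
}
```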

Member Author

This makes DeleteTestTenants delete all tenants starting with children, instead of hard-coding which need deleting first.
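
One way to get a children-first order without hard-coding it is to repeatedly pick tenants that have no remaining children. The sketch below assumes a hypothetical `Tenant` type keyed by name; it illustrates the ordering only and is not the code in this PR.

```go
package main

import "fmt"

// Tenant is a hypothetical, simplified tenant record for this sketch.
type Tenant struct {
	Name       string
	ParentName string // empty for the root tenant
}

// deletionOrder returns tenant names ordered so that every tenant appears
// before its parent, by repeatedly taking tenants with no remaining children.
func deletionOrder(tenants []Tenant) []string {
	remaining := make(map[string]Tenant, len(tenants))
	for _, t := range tenants {
		remaining[t.Name] = t
	}
	order := make([]string, 0, len(tenants))
	for len(remaining) > 0 {
		// Mark every tenant that still has at least one child left.
		hasChild := map[string]bool{}
		for _, t := range remaining {
			if _, ok := remaining[t.ParentName]; ok {
				hasChild[t.ParentName] = true
			}
		}
		progressed := false
		for _, t := range tenants { // iterate the slice for a deterministic order
			if _, ok := remaining[t.Name]; ok && !hasChild[t.Name] {
				order = append(order, t.Name)
				delete(remaining, t.Name)
				progressed = true
			}
		}
		if !progressed {
			break // cycle in the input; stop rather than loop forever
		}
	}
	return order
}

func main() {
	tenants := []Tenant{
		{Name: "root", ParentName: ""},
		{Name: "a", ParentName: "root"},
		{Name: "b", ParentName: "a"},
		{Name: "c", ParentName: "root"},
	}
	fmt.Println(deletionOrder(tenants))
}
```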

Member

+1

@asfgit
Contributor

asfgit commented Dec 25, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2987/
Test PASSed.

Contributor

@moltzaum moltzaum Dec 27, 2018

@rob05c Nearly two months ago @dangogh changed Fatalf to Errorf across all tests because Fatalf prevents tests from cleaning up (f32c5ec). It seems that other tests would have the same panic issue if their create method failed. I want to make sure we have a consistent understanding of how tests should terminate and why.

Contributor

As an example outside this PR it seems in crconfig_test.go Dan changed Fatalf to Errorf (2018-10-30) then you used Fatalf again afterwards (2018-11-29). All cases of Errorf vs Fatalf should probably be covered in a separate PR imo.

Member

+1 -- if the Fatalf occurs before the test gets cleaned up, the following test may fail. Really should only use Fatalf if we can't recover from the situation, and that should be made very clear.

Member Author

"Really should only use Fatalf if we can't recover from the situation"

That's exactly what happens here, that's why I changed it. If an error is returned, the following line

log.Debugln("Response: ", resp.Alerts)

will panic. If an error is returned, the test can't keep going, which is exactly what Fatalf is for. Because of the panic, the test won't get cleaned up anyway, and subsequent tests will fail anyway. Using Fatal instead of Error avoids the panic stacktrace, and makes it easier to see what went wrong to fix it.
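
To illustrate the difference outside the test framework, here is a small sketch with a hypothetical `Response` type and a hypothetical `createUser` call that always fails. Continuing after the error (as with Errorf) dereferences a nil response and panics; stopping at the error (as with Fatalf) does not.

```go
package main

import (
	"errors"
	"fmt"
)

// Response is a hypothetical stand-in for a client response carrying alerts.
type Response struct{ Alerts []string }

// createUser is a hypothetical client call that fails and returns a nil response.
func createUser() (*Response, error) {
	return nil, errors.New("could not create user")
}

// continueAfterError mimics reporting the error and carrying on, as t.Errorf does.
// It reports whether using the nil response panicked.
func continueAfterError() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	resp, err := createUser()
	if err != nil {
		fmt.Println("error (test continues):", err)
	}
	fmt.Println("Response: ", resp.Alerts) // nil pointer dereference
	return false
}

// stopAfterError mimics t.Fatalf: report the error and stop before using resp.
func stopAfterError() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	resp, err := createUser()
	if err != nil {
		fmt.Println("fatal (test stops):", err)
		return false
	}
	fmt.Println("Response: ", resp.Alerts)
	return false
}

func main() {
	fmt.Println("continue panicked:", continueAfterError())
	fmt.Println("stop panicked:", stopAfterError())
}
```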

We really need to fix the API Test framework to clean up tests even if they panic or call t.Fatalf. I don't think it'll be hard; I think it's as simple as changing all our tests from e.g.

func TestCacheGroups(t *testing.T) {
	CreateTestTypes(t)
	CreateTestCacheGroups(t)
	GetTestCacheGroups(t)
	CheckCacheGroupsAuthentication(t)
	UpdateTestCacheGroups(t)
	DeleteTestCacheGroups(t)
	DeleteTestTypes(t)
}

To

func TestCacheGroups(t *testing.T) {
	defer DeleteTestTypes(t)
	CreateTestTypes(t)
	defer DeleteTestCacheGroups(t)
	CreateTestCacheGroups(t)
	GetTestCacheGroups(t)
	CheckCacheGroupsAuthentication(t)
	UpdateTestCacheGroups(t)
}

(similar to func TestOrigins(t *testing.T), but the defer Deletes need to come before their Creates).

But that should probably be its own PR.
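
The reason the deferred deletes still run is that t.FailNow (and therefore t.Fatalf) stops the test by calling runtime.Goexit, which runs deferred calls before the goroutine exits. The sketch below simulates that outside the testing package; the step names and the `runSteps` helper are illustrative only.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runSteps simulates a test body on its own goroutine. Each "delete" is deferred
// before its "create", so it runs even if that create stops the goroutine.
func runSteps(failAt string) []string {
	var log []string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		step := func(name string) {
			log = append(log, name)
			if name == failAt {
				runtime.Goexit() // stands in for t.Fatalf stopping the test
			}
		}
		defer func() { log = append(log, "delete types") }()
		step("create types")
		defer func() { log = append(log, "delete cachegroups") }()
		step("create cachegroups")
		step("get cachegroups")
	}()
	wg.Wait() // the goroutine has finished, so reading log is safe
	return log
}

func main() {
	for _, s := range runSteps("create cachegroups") {
		fmt.Println(s)
	}
}
```

Deferred calls run in last-in, first-out order, so cache groups are deleted before types, mirroring the explicit delete order in the original test.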

Member Author

Actually, thinking about this some more, I think we should use Fatalf anywhere the test can't continue, irrespective of deferred cleanup attempts.

There are always going to be cases where "delete" can't clean up appropriately after a test fails at some point. Test errors are just like compile errors: you should only ever consider the first failure, because any one failure can cause a chain of other unrelated failures, or worse, false successes. Any one test failure is a failure of the entire test suite.

The only right answer is to only consider the first failure, and cleanup is irrelevant, because all subsequent tests, pass or fail, are invalid.

This means all build pipelines (and people) should be using go test -failfast.

Member Author

Ok, the Fatal cleanup issue has been fixed in #3173

Member

@dangogh dangogh left a comment

also has conflicts w/ master in the tc-fixtures.json file...

Member

+1 on moving these -- dunno why they got in asn.go in the first place..

Member

that's a lot of duplicate code.. can the former call the latter with n=-1 (for no limit)?

Member Author

Ok, I changed them to use a common func. IMO the abstraction isn't worth the readability cost. So, I changed it in the safest way I could think of.

A sentinel value (-1 or 0) would be confusing, since not passing a limit results in the default API limit of 20, not infinite (which is the bug that adding an N func fixes). Likewise, passing the API's default limit explicitly would be bug-prone if the API changed. So passing the query param itself seemed the safest and least confusing.
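
A sketch of that shape, under stated assumptions: the route constant, the function names, and the `limit` parameter name below are placeholders for illustration, not the client code in this PR. The common function takes the query parameters as given, the default variant passes none, and the N variant passes an explicit limit.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// basePath is a placeholder route for this sketch, not the real API path.
const basePath = "/example/deliveryserviceserver"

// pathWithParams is the common function: it appends whatever query parameters
// the caller supplies, so no sentinel value for "no limit" is needed.
func pathWithParams(params url.Values) string {
	if len(params) == 0 {
		return basePath
	}
	return basePath + "?" + params.Encode()
}

// pathDefault sends no parameters, so the server's default limit applies.
func pathDefault() string {
	return pathWithParams(nil)
}

// pathN asks for up to n results; "limit" is an assumed parameter name.
func pathN(n int) string {
	return pathWithParams(url.Values{"limit": []string{strconv.Itoa(n)}})
}

func main() {
	fmt.Println(pathDefault())
	fmt.Println(pathN(1000000))
}
```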

Member

+1

Member

+1 -- this is much clearer, although still pretty hairy. I wonder if it would be worth creating this as an SQL function that can be tested/maintained outside of the code?

Member

We've already decided to eliminate the use_tenancy param and have it now always set to 1. Can we get rid of that portion?

Member Author

I think we made the decision, but the code doesn't currently enforce it: at present, users can still leave tenancy disabled.

I think we need to merge #2791 before removing use_tenancy checks.

@rob05c rob05c force-pushed the to-fix-tenancy-inactivity branch from bc59bb5 to e20885a Compare December 29, 2018 06:11
@rob05c
Member Author

rob05c commented Dec 29, 2018

Fixed merge conflicts.

@asfgit
Contributor

asfgit commented Dec 29, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2994/
Test PASSed.

@asfgit
Contributor

asfgit commented Dec 29, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/2995/
Test PASSed.

rob05c added 2 commits January 7, 2019 15:24
- Fix TO Tenancy to prevent access when parent tenants are inactive.

- Add TO API Tests to verify tenancy, verify access is disabled when
  parents or self are inactive.

- Fix TO API Test user panic on test failure.

- Fix TO API Test tenants to automatically delete children before
  parents, rather than hard-coding.

- Fix TO API Test parameters race condition, fixed to delete all
  parameters not only first name+file.

- Fix TO API Tests to enable Tenancy on all tests (by creating
  Parameters, which creates the use_tenancy=true param).
  The TO API Tests were completely broken with tenancy, due
  to the above bugs. With the bugs fixed, the tests pass
  with tenancy enabled.

Fixes apache#2732
@rob05c rob05c force-pushed the to-fix-tenancy-inactivity branch from 3961cbb to f8ac49f Compare January 7, 2019 22:24
@asfgit
Contributor

asfgit commented Jan 7, 2019

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/trafficcontrol-PR/3014/
Test PASSed.

@dangogh dangogh merged commit b8cc607 into apache:master Jan 16, 2019

Labels

bug (something isn't working as intended), high impact (impacts the basic function, deployment, or operation of a CDN), new feature (A new feature, capability or behavior), Traffic Ops (related to Traffic Ops)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to Edit or Delete a Tenant where Active is false

4 participants