Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-14532 gurt: fix environment APIs hook #13579

Merged
merged 7 commits into from
Jan 30, 2024

Conversation

knard-intel
Copy link
Contributor

@knard-intel knard-intel commented Jan 8, 2024

Description

This PR is based on the PR #13483 allowing thread safe management of environment variables.
This PR mainly replace call to the standard getenv() function with d_isenv_def(), d_agetenv() and d_freeenv_str().
It also renames the d_free_env_str() function intoduce in the previous PR to d_freeenv_str().

Validation

The run #17 of the CI was done with the CI tag Allow-unstable-test: true to check that no regression was introduced with the Features: dfuse tests. This first tag was needed as the test dfuse/daos_build.py is not supporting change in the DAOS API and this patch is renamed the function d_free_env_str() into d_freeenv_str(). However, this last cosmetic modification should be acceptable as the function d_free_env_str() was recently introduced by the PR #13483, and thus this function is not available in any DAOS release. At the end, the only failure with the CI run #17 is an expected one related to the test dfuse/daos_build.py (more derails could be found in the log file debug.log of the test).

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follows the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate watchers.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Copy link

github-actions bot commented Jan 8, 2024

Bug-tracker data:
Ticket title is 'frontera: segfault on mvapich init'
Status is 'In Review'
Labels: 'TACC,TACC_Frontera,triaged'
https://daosio.atlassian.net/browse/DAOS-14532

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13579/1/execution/node/1362/log

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13579/1/execution/node/1491/log

@@ -896,6 +896,11 @@ crt_init_opt(crt_group_id_t grpid, uint32_t flags, crt_init_options_t *opt)
D_FREE(domain0);
D_FREE(provider_str0);
D_FREE(auth_key0);
d_freeenv_str(&port_str);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just save the return from d_agetenv and call these later rather than adding the extra dup?

Copy link
Contributor Author

@knard-intel knard-intel Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to find the less disruptive solution without introducing new variables, and I also did not think of this approach.
At the end, I really prefer your solution which prevents useless extra memory allocation. Thus, I will update all the patch with your proposal.

  • Remove useless strdup() with saving d_agetenv() output result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to find the less disruptive solution without introducing new variables, and I also did not think of this approach. At the end, I really prefer your solution which prevents useless extra memory allocation. Thus, I will update all the patch with your proposal.

  • Remove useless strdup() with saving d_agetenv() output result

Fixed with commit 761ecbe

printf("Error: interface (OFI_INTERFACE) is not set\n");
printf("Example: export OFI_INTERFACE=eth0\n");
D_GOTO(cleanup, ret = -DER_INVAL);
}

if (attach_info_path)
attach_path = attach_info_path;
D_STRNDUP(attach_path, attach_info_path, strlen(attach_info_path));
Copy link
Contributor Author

@knard-intel knard-intel Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Remove useless strdup() with saving d_agetenv() output result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Remove useless strdup() with saving d_agetenv() output result

Fixed with commit 761ecbe

@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 10b7b53 to 761ecbe Compare January 11, 2024 09:55
@knard-intel knard-intel changed the base branch from master to ckochhof/fix/master/daos-14896 January 11, 2024 09:57
@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/3/testReport/

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13579/3/execution/node/1480/log

Base automatically changed from ckochhof/fix/master/daos-14896 to master January 16, 2024 18:37
@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/4/testReport/

@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 761ecbe to a5cb7ba Compare January 17, 2024 09:55
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/5/testReport/

@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/6/testReport/

@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from a5cb7ba to 039707a Compare January 17, 2024 21:38
@knard-intel knard-intel changed the base branch from master to ckochhof/hotfix/maser/daos-14981 January 17, 2024 21:39
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/9/testReport/

@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 039707a to 258434a Compare January 18, 2024 07:38
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/10/testReport/

@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 258434a to 9a20c80 Compare January 19, 2024 11:11
src/cart/crt_init.c Outdated Show resolved Hide resolved
src/cart/crt_init.c Outdated Show resolved Hide resolved
Base automatically changed from ckochhof/hotfix/maser/daos-14981 to master January 19, 2024 13:06
src/mgmt/cli_mgmt.c Outdated Show resolved Hide resolved
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/12/testReport/

src/cart/crt_init.c Outdated Show resolved Hide resolved
src/cart/crt_init.c Outdated Show resolved Hide resolved
@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 7ccefef to ff596d2 Compare January 19, 2024 21:34
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13579/17/testReport/

Replace getenv() function with d_agetenv_str() and d_free_env_str()

Features: mpiio
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
Replace d_free_env_str() function with d_freeenv_str()

Features: mpiio
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
Integrate reviwers comments:
- Remove useless strdup() with saving d_agetenv() output result

Features: mpiio
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
Replace getenv() function call with d_agetenv_str() and d_freeenv_str().

Allow-unstable-test: true
Features: mpiio dfuse
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
Integrate reviewers comments:
- Move the var_names outside the function
- Move initializers into declarations.
- Simplify debug information printing

Allow-unstable-test: true
Features: mpiio dfuse
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
fix clang-format issue.

Allow-unstable-test: true
Features: mpiio dfuse
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
fix clang-format issue and integrate reviewers comments:
- Use local variable

Features: mpiio
Required-githooks: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
@knard-intel knard-intel force-pushed the ckochhof/fix/master/daos-14532-part2 branch from 362059f to 0264933 Compare January 22, 2024 07:26
@knard-intel
Copy link
Contributor Author

knard-intel commented Jan 22, 2024

The run #17 of the CI was done with the CI tag Allow-unstable-test: true to check that no regression was introduced with the Features: dfuse tests. This first tag was needed as the test dfuse/daos_build.py is not supporting change in the DAOS API and this patch is renamed the function d_free_env_str() into d_freeenv_str(). However, this last cosmetic modification should be acceptable as the function d_free_env_str() was recently introduced by the PR #13483, and thus this function is not available in any DAOS release. At the end, the only failure with the CI run #17 is an expected one related to the test dfuse/daos_build.py (more derails could be found in the log file debug.log of the test).

@knard-intel knard-intel marked this pull request as ready for review January 22, 2024 21:20
@knard-intel knard-intel requested review from a team as code owners January 22, 2024 21:20
@knard-intel
Copy link
Contributor Author

@ashleypittman , please could you have a look at the modification and check if they are OK for you.
Thanks in advance.

@@ -106,7 +106,7 @@ __shim_handle__daos_init(PyObject *self, PyObject *args)

rc = daos_init();
if ((rc == 0) && (use_glob_eq == 0)) {
override = getenv("PYDAOS_GLOB_EQ");
d_agetenv_str(&override, "PYDAOS_GLOB_EQ");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a case for d_getenv_bool() but I can make this change in other pydaos work I'm doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have to repush, I will fix this.
Otherwise, if you can do it in your PR, it will be perfect: thanks for your proposal :)
Otherwise, I will do it in a follow up PR.

@knard-intel knard-intel requested a review from a team January 25, 2024 15:42
@knard-intel
Copy link
Contributor Author

@daos-stack/daos-gatekeeper please could you land this PR with the following message:

Title: DAOS-14532 gurt: fix environment APIs hook
Description:

Replace call to the standard getenv() function with d_isenv_def(), d_agetenv() and d_freeenv_str().
It also renames the d_free_env_str() function intoduce in previous related PRs to d_freeenv_str().

@knard-intel knard-intel self-assigned this Jan 29, 2024
@knard-intel
Copy link
Contributor Author

@daos-stack/daos-gatekeeper could it be possible to land this PR or something is missing ?

@ashleypittman ashleypittman merged commit 9f78878 into master Jan 30, 2024
54 checks passed
@ashleypittman ashleypittman deleted the ckochhof/fix/master/daos-14532-part2 branch January 30, 2024 08:28
brianjmurrell pushed a commit that referenced this pull request Jan 31, 2024
Replace call to the standard getenv() function with d_isenv_def(), d_agetenv() and d_freeenv_str().
It also renames the d_free_env_str() function intoduce in previous related PRs to d_freeenv_str().


# NOTE: The following default set of pragmas will result in no build
#       or test.
#
#       Please edit the list of pragmas to produce the minimal amount of build
#       and test to prove your PR works.
#
#       Or if you have already done the above, and your PR is ready for it's
#       final for-landing run, remove all of the following pragmas.
#
#       More detail can be found at
#       https://wiki.hpdd.intel.com/display/CI/Commit+Pragmas.
# You should not use this one without prior approval:
#Priority: 3
# Run the GitHub Actions test jobs
#Run-GHA: false
Skip-checkpatch: true
Skip-python-bandit: true
Skip-build: true
# Or you can skip specific PR build stages:
#Skip-build-ubuntu20-rpm: true
#Skip-build-el8-rpm: true
#Skip-build-leap15-rpm: true
#Skip-build-el8-gcc: true
#Skip-build-el8-gcc-debug: true
#Skip-build-el8-gcc-release: true
#Skip-build-leap15-gcc: true
#Skip-build-leap15-icc: true
#Skip-build-ubuntu-clang: true

# And/or you could just enable quick-build below which reduces build tims considerably
Quick-build: true
# Just build and run Functional tests, skipping all others
Quick-Functional: true
# or building with multiple jobs
#Parallel-build: true

#Skip-unit-tests: true
# Or you could choose which Unit Test stages to skip below
#Skip-nlt: true
#Skip-unit-test: true
#Skip-unit-test-memcheck: true
#Skip-test: true

# Every run should include the features it is testing.  Uncomment the below
# line and add the features (space separated) your PR should be testing for.
# I.e. rebuild, ...
#Features:

# Or you can choose which test stages to skip below
#Skip-coverity-test: true
#Skip-fault-injection-test: true
# Skip all Functional test stages
#Skip-func-test: true
# Skip all Functional test stages on VMs
#Skip-func-test-vm: true
# Don't skip the valgrind functional test
#Skip-func-test-vm-valgrind: false
# Skip Functional test stages on EL7, EL8 or Leap 15
#Skip-func-test-el7: true
#Skip-func-test-el8: true
#Skip-func-test-leap15: true
# Skip all Functional test stages on hardware
#Skip-func-hw-test: true
# or just specific hardware test stages
#Skip-func-hw-test-small: true
#Skip-func-hw-test-medium: true
#Skip-func-hw-test-medium-verbs-provider: true
#Skip-func-hw-test-medium-ucx-provider: true
#Skip-func-hw-test-large: true
#Skip-scan-rpms: true
#Skip-test-rpms: true
# or per distro variants:
#Skip-scan-centos-rpms: true
#Skip-scan-centos-7-rpms: true
#Skip-scan-centos-8-rpms: true
#Skip-scan-leap-15-rpms: true
#Skip-test-centos-rpms: true
#Skip-test-centos-7-rpms: true
#Skip-test-centos-8-rpms: true
#Skip-test-centos-8.3-rpms: true

# If you do run any of the above Functional test stages, please use the following to
# limit your tests run to the tests/features you are working on
# You can add the tags for your tests to the appropriate pragmas below:
#Test-tag: -hw
#Test-tag-hw-small: hw,small
#Test-tag-hw-medium: hw,medium,ib2
#Test-tag-hw-large: hw,large

# If you want to allow Hardware testing to run even if VM testing fails:
Allow-unstable-test: true

# if you want to test your PR with a PR of a component, you can speify that PR:
#PR-repos: project@PR-number[:build_num]

# if you want to disable using RPMs in the Functional tests:
#RPM-test: false

# Or if you want to test with daos RPMs that are already built (i.e. in another PR or in the repo):
#RPM-test-version: version[-release]
# to get the latest RPMs for the current verison:
#RPM-test-version: 2.5.100

# This should be figured out automatically, but you can force it with
#Doc-only: true

# If you want to use a repo-files PR
# Repo-files-PR: PR-123

# If you'd prefer not to have your PRs cluttered up with stage failure
# comments, uncomment this
# Skip-PR-comments: true

# ansible-role-snapshot_host-branch: branch-name
# ansible-role-kvm-branch: branch-name

# to use a different cluster for various stages
# VM1-label: ci_vm1
# Ubuntu-VM9-label: ci_vm9
# Leap15-VM9-label: ci_vm9
# EL8-VM9-label: ci_vm9
# HW-medium-label: ci_nvme5
# HW-large-label: ci_nvme9

# to tell CI not to skip a test due to a ticket
# Fixes: DAOS-1234
#
# If you are ready to push a full CI build and test you can uncomment
# the below pragma:
#Full-CI-run: true
Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
jolivier23 pushed a commit that referenced this pull request Feb 28, 2024
Replace call to the standard getenv() function with d_isenv_def(), d_agetenv() and d_freeenv_str().
It also renames the d_free_env_str() function intoduce in previous related PRs to d_freeenv_str().

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants