older series are not show #722

Open
curtisecombsjr opened this issue Dec 22, 2023 · 10 comments

Comments

@curtisecombsjr

curtisecombsjr commented Dec 22, 2023

Hello! We have several hundred series in ~/.pavilion/working_dir/series, but Pavilion seems to only "know about" the last 2, and it outputs the incorrect tests for older series when queried with pav status s<sid>.


~/.pavilion/working_dir/series> ls

... (truncated) ...

0000105  0000218  0000331  0000444  0000557  0000670  0000784
0000106  0000219  0000332  0000445  0000558  0000671  0000785
0000107  0000220  0000333  0000446  0000559  0000672  0000786
0000108  0000221  0000334  0000447  0000560  0000673  0000787
0000109  0000222  0000335  0000448  0000561  0000674  0000788
0000110  0000223  0000336  0000449  0000562  0000675  next_id
0000111  0000224  0000337  0000450  0000563  0000676  series_info_transform.db
0000112  0000225  0000338  0000451  0000564  0000677
0000113  0000226  0000339  0000452  0000565  0000678
~/.pavilion/working_dir/series> ls -lad 0* | wc -l
787

~/.pavilion/working_dir/series> pav list series
s788 s787

~/.pavilion/working_dir/series> ls 0000675/
0003009  0003010  0003011  0003012  config  dependency  series.out  series.pgid

~/.pavilion/working_dir/series> pav status s675
 Test statuses
---------+-----------------------+----------+-------------+--------------------
 Test id | Name                  | State    | Time        | Note
---------+-----------------------+----------+-------------+--------------------
 3460    | streams-pbs-stg2.base | COMPLETE | 21 19:54:06 | The test completed
         |                       |          |             | with result: PASS
 3461    | hpl-pbs-stg2.base     | COMPLETE | 21 20:38:59 | The test completed
         |                       |          |             | with result: PASS
 3462    | dgemm-pbs-stg2.base   | COMPLETE | 21 21:27:09 | The test completed
         |                       |          |             | with result: PASS
 3463    | hpcg-pbs-stg2.base    | COMPLETE | 21 22:11:01 | The test completed
         |                       |          |             | with result: PASS
 3464    | p2p-pbs-stg2.base     | COMPLETE | 21 22:17:11 | The test completed
         |                       |          |             | with result: PASS
 3465    | gfscpu-pbs-stg2.base  | COMPLETE | 21 22:36:42 | The test completed
         |                       |          |             | with result: PASS
~/.pavilion/working_dir/series>

I am using "pav status" on an older series, 675. The current series is 788; however, the test IDs for s788 are shown for s675.

(Notice here that the Pavilion test IDs listed in the 0000675 directory are 3009-3012, NOT 3460-3465.)
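
The comparison described above (test IDs on disk versus test IDs reported by pav status) can be sketched in a few lines of Python. This is only a rough illustration, not part of Pavilion: the function names series_test_ids and compare_test_ids are hypothetical, and the sketch assumes that every all-digit entry in a series directory is a test ID, as the ls 0000675/ output suggests. The reported IDs have to be copied by hand from the pav status output.

```python
import re
from pathlib import Path


def series_test_ids(series_dir):
    """Return the sorted integer test IDs found in a series directory.

    Assumption: every entry whose name is made up only of the digits 0-9
    (e.g. "0003009") is a test ID; other entries such as "config" or
    "series.out" are ignored.
    """
    ids = []
    for entry in Path(series_dir).iterdir():
        if re.fullmatch(r"[0-9]+", entry.name):
            ids.append(int(entry.name))
    return sorted(ids)


def compare_test_ids(series_dir, reported_ids):
    """Compare the test IDs on disk with the IDs reported by `pav status`.

    Returns a tuple (only_on_disk, only_reported), each a sorted list.
    Both lists are empty when the two sources agree.
    """
    on_disk = set(series_test_ids(series_dir))
    reported = set(reported_ids)
    return sorted(on_disk - reported), sorted(reported - on_disk)
```

For the case in this report, one would call compare_test_ids with the expanded path to 0000675 and the IDs 3460 through 3465 taken from the table above. Any non-empty list in the result indicates that the status output and the series directory disagree.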

This only started happening today. Thank you so much!

@curtisecombsjr
Author

curtisecombsjr commented Dec 24, 2023

Strange: I enabled "log_level: debug" to see if I could find an error, but that seems to have broken it even more. Now pav.log does not show any output and the series list is completely empty...


~> pav list series
No matching items found.
~>

Also, we are running 2.3, if that helps.

:~> pav --version
Pavilion 2.3
:~>

Thanks

@Paul-Ferrell
Collaborator

Paul-Ferrell commented Jan 2, 2024

Sorry for the delay - we've been out the last few weeks for holiday break.

Double check that the working directory path Pavilion is using is what you think it is.
pav show config will show you the Pavilion config settings as Pavilion sees them. Double check the path as given there.
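
When comparing the path from pav show config against the directory you have been looking at, keep in mind that two different-looking paths can still point to the same place (through ~ or a symlink), and two similar-looking ones can differ. A minimal Python sketch of that comparison follows. The helper name same_directory is hypothetical, and the sketch does not parse any Pavilion output: it assumes you copy the working directory value out of pav show config by hand.

```python
import os


def same_directory(path_a, path_b):
    """Return True if both paths resolve to the same location.

    Each path has a leading "~" expanded and symlinks resolved before
    the comparison, so a symlinked install location and its real target
    compare as equal.
    """
    def resolve(path):
        return os.path.realpath(os.path.expanduser(path))

    return resolve(path_a) == resolve(path_b)
```

If this returns False for the configured working directory and the directory that holds the 787 series, Pavilion is reading its series from somewhere else.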

@Paul-Ferrell
Collaborator

Also, pav show config_dir will list all the config directories.

@curtisecombsjr
Author

curtisecombsjr commented Jan 2, 2024

Paul,

Thanks so much. Everything in the output here seems to be correct. Let me attach it to this comment, so that you can see for yourself, but as far as I can tell, the directories line up and match. New mystery: pav list series now shows no series. Very strange.
pav.txt

@Paul-Ferrell
Collaborator

Paul-Ferrell commented Jan 2, 2024 via email

@curtisecombsjr
Author

I tried to switch to the latest clone that I could, but unfortunately our nodes aren't fully updated and I got errors, so I had to switch back. That being said, I do not think that this version is that old; it might be a year-ish. Is there a way that I can check? It would be very difficult for us to upgrade at this moment either way, sadly. A node OS update won't happen for a while, and even then we are not updating to the latest SP (SUSE). Here's my YAML:
pav_config.txt

@Paul-Ferrell
Collaborator

Paul-Ferrell commented Jan 2, 2024 via email

@curtisecombsjr
Author

curtisecombsjr commented Jan 2, 2024

Could this be helpful? The install has been moved around and I wasn't the original person who installed it. Kinda just inherited it:

hpc-adm@clogin09:/apps/dev/pavilion/src/pavilion2> git log | cat | head -20
commit 6d6e359bc551e3981b14a5516a4028b96e2f3042
Author: Francine Lapid <55203623+francinelapid@users.noreply.github.com>
Date:   Tue Apr 6 11:17:44 2021 -0600

    fixed command-line overrides (#397)

    Co-authored-by: Paul Ferrell <51765748+Paul-Ferrell@users.noreply.github.com>

commit 07e9ebe47b5504132832c7564c97b813508337ca
Author: Francine Lapid <55203623+francinelapid@users.noreply.github.com>
Date:   Tue Apr 6 11:05:24 2021 -0600

    cancels entire series if there's a scheduler error (#399)

    * cancels entire series if there's a scheduler error

    * includes test id in output error

    * replaced by_sigterm parameter with message

hpc-adm@clogin09:/apps/dev/pavilion/src/pavilion2> git rev-parse --short HEAD
6d6e359b
hpc-adm@clogin09:/apps/dev/pavilion/src/pavilion2>

@Paul-Ferrell
Collaborator

Paul-Ferrell commented Jan 2, 2024 via email

@curtisecombsjr
Author

Awesome, thank you! And take your time. This is not seriously affecting us at the moment. We run pavilion tests for our burn-ins and it's working fine for those, it would just be a problem if we needed to look at the past results (which does happen, but not that often). Thanks again!
