TMA-4.7 release #140

Merged
merged 2 commits into main on Feb 11, 2024
Conversation

@ayasini (Contributor) commented Feb 11, 2024

No description provided.

@ayasini merged commit 0153e18 into main on Feb 11, 2024
3 checks passed
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - tma_c01_wait and tma_c02_wait metrics measure power-performance
   states.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-14-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc and tma_info_inst_mix_ipflop.
 - Removal of tma_info_bad_spec_branch_misprediction_cost.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-15-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc and tma_info_inst_mix_ipflop.
 - Removal of tma_info_bad_spec_branch_misprediction_cost.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-16-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc and tma_info_inst_mix_ipflop.
 - Removal of tma_info_bad_spec_branch_misprediction_cost.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-17-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-18-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth and
   tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-19-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth and
   tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-20-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-21-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-22-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_fetch_bandwidth and
   tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-23-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_fetch_bandwidth and
   tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-24-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-25-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-26-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - Add metrics tma_fp_vector_128b, tma_fp_vector_256b and
   tma_info_system_cpus_utilized.
 - Remove metrics tma_info_system_mem_parallel_requests,
   tma_info_system_core_frequency and
   tma_info_system_mem_request_latency.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - Tuned thresholds for tma_fetch_bandwidth.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-27-irogers@google.com
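
One way to check the added and removed metric names listed in a commit like the one above is to compare the metric names in the old and new generated metric files. The following is a minimal sketch, not part of the perfmon tooling: it assumes each file is a JSON array of objects carrying a "MetricName" key, and the helper names `metric_names` and `diff_metrics` and the inline sample data are hypothetical stand-ins rather than real perf output.

```python
import json


def metric_names(metrics_json_text):
    """Return the set of MetricName values found in a metrics JSON document.

    Assumption: the document is a JSON array of objects, each optionally
    carrying a "MetricName" key.
    """
    entries = json.loads(metrics_json_text)
    return {e["MetricName"] for e in entries if "MetricName" in e}


def diff_metrics(old_text, new_text):
    """Return (added, removed) metric names, each sorted alphabetically."""
    old, new = metric_names(old_text), metric_names(new_text)
    return sorted(new - old), sorted(old - new)


# Tiny hand-written stand-ins for the real files; not actual perf data.
OLD = json.dumps([
    {"MetricName": "tma_fetch_bandwidth", "MetricExpr": "a / b"},
    {"MetricName": "tma_info_system_core_frequency", "MetricExpr": "c / d"},
])
NEW = json.dumps([
    {"MetricName": "tma_fetch_bandwidth", "MetricExpr": "a / b"},
    {"MetricName": "tma_fp_vector_128b", "MetricExpr": "e / f"},
])

added, removed = diff_metrics(OLD, NEW)
print("added:", added)
print("removed:", removed)
```

With real files, the two strings would be read from the old and new metric JSON for one CPU model. Comparing sets of names only catches additions and removals; threshold or expression changes would need a per-metric comparison of the other fields.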
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - tma_c01_wait and tma_c02_wait metrics measure power-performance
   states.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-28-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-29-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-30-irogers@google.com
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Feb 21, 2024
Top-Down Microarchitecture Analysis (TMA) metrics simplify
cycle-accounting using microarchitecture-abstracted metrics
organized in one hierarchy. This update is from version 4.5 to
4.7.

The update includes:

 - tma_info_bottleneck* metrics, an abstraction or summarization of
   the 100+ TMA tree nodes into 12-entry familiar performance metrics.
 - Reduce number of events (multiplexing) for tma_info_system_gflops,
   tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0.
 - Fixes for tma_info_bottleneck_mispredictions and
   tma_info_bad_spec_branch_misprediction_cost.
 - New tma_info_inst_mix_ippause metric.
 - tma_serializing_operation is raised to level 3.
 - Swapped tma_info_core_ilp (becomes per SMT thread) and
   tma_info_pipeline_execute (per physical core).
 - tma_nop_instructions and tma_shuffles_256b are lowered to level 4
   under tma_other_light_ops_group.
 - Reduced number of events when SMT is off.
 - Tuned thresholds for tma_info_bottleneck_branching_overhead,
   tma_fetch_bandwidth and tma_ports_utilized_3m.

The update came from:

intel/perfmon#140
intel/perfmon#138

Running the script:

https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214011820.644458-31-irogers@google.com