Commit e4b4205

Merge tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:

- Fix adding a new fgraph callback after function graph tracing has already started. If the new caller does not initialize its hash before registering the fgraph_ops, it can cause a NULL pointer dereference. Fix this by adding a new parameter to ftrace_graph_enable_direct() that passes in the newly added gops directly, instead of relying on the fgraph_array[], as entries in the fgraph_array[] must be initialized. Assign the new gops to the fgraph_array[] only after it goes through ftrace_startup_subops(), as that properly initializes gops->ops and its hashes.

- Fix a memory leak in the fgraph storage memory test. If the "multiple fgraph storage on a function" boot-up selftest fails while registering the function graph tracer, it does not free the memory it allocated for the filter. Break the loop into two: allocate all the filters first, then register the functions, so that any error takes the appropriate cleanup path.

- Only clear the timerlat timers if they have an associated kthread. If the rtla tool that uses timerlat is killed just as it is shutting down, the signals can free the kthread and the timer, but the closing of the timerlat files could still call hrtimer_cancel() on the already freed timer. As the kthread variable is set to NULL when the kthreads are stopped and the timers are freed, it can be used to skip the hrtimer_cancel() call when the kthread variable is NULL.

- Use a cpumask to keep track of osnoise/timerlat kthreads. The timerlat tracer can use user-space threads for its analysis. When the rtla tool is killed, the kernel can get confused about whether the analysis is being done by a user-space thread or by one of its own kernel threads. When this confusion happens, kthread_stop() can be called on a user-space thread, and bad things happen. As the kernel threads are per-CPU, a bitmask can be used to record whether a kernel thread or a user-space thread is in use.

- Add the missing interface_lock to osnoise/timerlat stop_kthread(). The stop_kthread() function in osnoise/timerlat clears the osnoise kthread variable and, if it was a user-space thread, does a put_task on it. But this can race with the closing of the timerlat files, which also does a put_task on the kthread; if the race happens, put_task is called on the task twice and the kernel oopses.

- Add cond_resched() to the tracing_iter_reset() loop. The latency tracers keep writing to the ring buffer without resetting it when they issue a new "start" event (like interrupts being disabled). When reading the buffer with an iterator, tracing_iter_reset() sets its pointer to that start event by walking through all the events in the buffer until it reaches the timestamp of the start event. With a very large buffer, the loop that looks for the start event has been reported to take so long on a non-preempt kernel that it can trigger a soft lockup warning. Add a cond_resched() to that loop to make sure that does not happen.

- Use list_del_rcu() for the eventfs ei->list variable. It was reported that running loops of creating and deleting kprobe events could cause a crash due to the eventfs list iteration hitting a LIST_POISON variable. The list is protected by SRCU, but when an item was deleted from the list, it used list_del(), which poisons the "next" pointer. This is exactly what list_del_rcu() exists to prevent.
* tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
  tracing/timerlat: Only clear timer if a kthread exists
  tracing/osnoise: Use a cpumask to know what threads are kthreads
  eventfs: Use list_del_rcu() for SRCU protected list variable
  tracing: Avoid possible softlockup in tracing_iter_reset()
  tracing: Fix memory leak in fgraph storage selftest
  tracing: fgraph: Fix to add new fgraph_ops to array after ftrace_startup_subops()
2 parents ad61873 + 5bfbcd1 commit e4b4205

File tree

5 files changed: 73 additions, 35 deletions

fs/tracefs/event_inode.c

Lines changed: 1 addition & 1 deletion

@@ -862,7 +862,7 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, int level)
 	list_for_each_entry(ei_child, &ei->children, list)
 		eventfs_remove_rec(ei_child, level + 1);
 
-	list_del(&ei->list);
+	list_del_rcu(&ei->list);
 	free_ei(ei);
 }
kernel/trace/fgraph.c

Lines changed: 18 additions & 13 deletions

@@ -1206,18 +1206,24 @@ static void init_task_vars(int idx)
 	read_unlock(&tasklist_lock);
 }
 
-static void ftrace_graph_enable_direct(bool enable_branch)
+static void ftrace_graph_enable_direct(bool enable_branch, struct fgraph_ops *gops)
 {
 	trace_func_graph_ent_t func = NULL;
 	trace_func_graph_ret_t retfunc = NULL;
 	int i;
 
-	for_each_set_bit(i, &fgraph_array_bitmask,
-			 sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
-		func = fgraph_array[i]->entryfunc;
-		retfunc = fgraph_array[i]->retfunc;
-		fgraph_direct_gops = fgraph_array[i];
-	}
+	if (gops) {
+		func = gops->entryfunc;
+		retfunc = gops->retfunc;
+		fgraph_direct_gops = gops;
+	} else {
+		for_each_set_bit(i, &fgraph_array_bitmask,
+				 sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) {
+			func = fgraph_array[i]->entryfunc;
+			retfunc = fgraph_array[i]->retfunc;
+			fgraph_direct_gops = fgraph_array[i];
+		}
+	}
 	if (WARN_ON_ONCE(!func))
 		return;
 
@@ -1256,8 +1262,6 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 		ret = -ENOSPC;
 		goto out;
 	}
-
-	fgraph_array[i] = gops;
 	gops->idx = i;
 
 	ftrace_graph_active++;
@@ -1266,7 +1270,7 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 		ftrace_graph_disable_direct(true);
 
 	if (ftrace_graph_active == 1) {
-		ftrace_graph_enable_direct(false);
+		ftrace_graph_enable_direct(false, gops);
 		register_pm_notifier(&ftrace_suspend_notifier);
 		ret = start_graph_tracing();
 		if (ret)
@@ -1281,14 +1285,15 @@ int register_ftrace_graph(struct fgraph_ops *gops)
 	} else {
 		init_task_vars(gops->idx);
 	}
-
 	/* Always save the function, and reset at unregistering */
 	gops->saved_func = gops->entryfunc;
 
 	ret = ftrace_startup_subops(&graph_ops, &gops->ops, command);
+	if (!ret)
+		fgraph_array[i] = gops;
+
 error:
 	if (ret) {
-		fgraph_array[i] = &fgraph_stub;
 		ftrace_graph_active--;
 		gops->saved_func = NULL;
 		fgraph_lru_release_index(i);
@@ -1324,7 +1329,7 @@ void unregister_ftrace_graph(struct fgraph_ops *gops)
 	ftrace_shutdown_subops(&graph_ops, &gops->ops, command);
 
 	if (ftrace_graph_active == 1)
-		ftrace_graph_enable_direct(true);
+		ftrace_graph_enable_direct(true, NULL);
 	else if (!ftrace_graph_active)
 		ftrace_graph_disable_direct(false);
 
kernel/trace/trace.c

Lines changed: 2 additions & 0 deletions

@@ -3958,6 +3958,8 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu)
 			break;
 		entries++;
 		ring_buffer_iter_advance(buf_iter);
+		/* This could be a big loop */
+		cond_resched();
 	}
 
 	per_cpu_ptr(iter->array_buffer->data, cpu)->skipped_entries = entries;

kernel/trace/trace_osnoise.c

Lines changed: 34 additions & 16 deletions

@@ -252,21 +252,32 @@ static inline struct timerlat_variables *this_cpu_tmr_var(void)
 	return this_cpu_ptr(&per_cpu_timerlat_var);
 }
 
+/*
+ * Protect the interface.
+ */
+static struct mutex interface_lock;
+
 /*
  * tlat_var_reset - Reset the values of the given timerlat_variables
  */
 static inline void tlat_var_reset(void)
 {
 	struct timerlat_variables *tlat_var;
 	int cpu;
+
+	/* Synchronize with the timerlat interfaces */
+	mutex_lock(&interface_lock);
 	/*
 	 * So far, all the values are initialized as 0, so
 	 * zeroing the structure is perfect.
 	 */
 	for_each_cpu(cpu, cpu_online_mask) {
 		tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
+		if (tlat_var->kthread)
+			hrtimer_cancel(&tlat_var->timer);
 		memset(tlat_var, 0, sizeof(*tlat_var));
 	}
+	mutex_unlock(&interface_lock);
 }
 #else /* CONFIG_TIMERLAT_TRACER */
 #define tlat_var_reset() do {} while (0)
@@ -331,11 +342,6 @@ struct timerlat_sample {
 };
 #endif
 
-/*
- * Protect the interface.
- */
-static struct mutex interface_lock;
-
 /*
  * Tracer data.
  */
@@ -1612,6 +1618,7 @@ static int run_osnoise(void)
 
 static struct cpumask osnoise_cpumask;
 static struct cpumask save_cpumask;
+static struct cpumask kthread_cpumask;
 
 /*
  * osnoise_sleep - sleep until the next period
@@ -1675,6 +1682,7 @@ static inline int osnoise_migration_pending(void)
 	 */
 	mutex_lock(&interface_lock);
 	this_cpu_osn_var()->kthread = NULL;
+	cpumask_clear_cpu(smp_processor_id(), &kthread_cpumask);
 	mutex_unlock(&interface_lock);
 
 	return 1;
@@ -1945,11 +1953,16 @@ static void stop_kthread(unsigned int cpu)
 {
 	struct task_struct *kthread;
 
+	mutex_lock(&interface_lock);
 	kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
 	if (kthread) {
-		if (test_bit(OSN_WORKLOAD, &osnoise_options)) {
+		per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
+		mutex_unlock(&interface_lock);
+
+		if (cpumask_test_and_clear_cpu(cpu, &kthread_cpumask) &&
+		    !WARN_ON(!test_bit(OSN_WORKLOAD, &osnoise_options))) {
 			kthread_stop(kthread);
-		} else {
+		} else if (!WARN_ON(test_bit(OSN_WORKLOAD, &osnoise_options))) {
 			/*
 			 * This is a user thread waiting on the timerlat_fd. We need
 			 * to close all users, and the best way to guarantee this is
@@ -1958,16 +1971,15 @@ static void stop_kthread(unsigned int cpu)
 			kill_pid(kthread->thread_pid, SIGKILL, 1);
 			put_task_struct(kthread);
 		}
-		per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
 	} else {
+		mutex_unlock(&interface_lock);
 		/* if no workload, just return */
 		if (!test_bit(OSN_WORKLOAD, &osnoise_options)) {
 			/*
 			 * This is set in the osnoise tracer case.
 			 */
 			per_cpu(per_cpu_osnoise_var, cpu).sampling = false;
 			barrier();
-			return;
 		}
 	}
 }
@@ -1982,12 +1994,8 @@ static void stop_per_cpu_kthreads(void)
 {
 	int cpu;
 
-	cpus_read_lock();
-
-	for_each_online_cpu(cpu)
+	for_each_possible_cpu(cpu)
 		stop_kthread(cpu);
-
-	cpus_read_unlock();
 }
 
 /*
@@ -2021,6 +2029,7 @@ static int start_kthread(unsigned int cpu)
 	}
 
 	per_cpu(per_cpu_osnoise_var, cpu).kthread = kthread;
+	cpumask_set_cpu(cpu, &kthread_cpumask);
 
 	return 0;
 }
@@ -2048,8 +2057,16 @@ static int start_per_cpu_kthreads(void)
 	 */
 	cpumask_and(current_mask, cpu_online_mask, &osnoise_cpumask);
 
-	for_each_possible_cpu(cpu)
+	for_each_possible_cpu(cpu) {
+		if (cpumask_test_and_clear_cpu(cpu, &kthread_cpumask)) {
+			struct task_struct *kthread;
+
+			kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
+			if (!WARN_ON(!kthread))
+				kthread_stop(kthread);
+		}
 		per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
+	}
 
 	for_each_cpu(cpu, current_mask) {
 		retval = start_kthread(cpu);
@@ -2579,7 +2596,8 @@ static int timerlat_fd_release(struct inode *inode, struct file *file)
 	osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
 	tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
 
-	hrtimer_cancel(&tlat_var->timer);
+	if (tlat_var->kthread)
+		hrtimer_cancel(&tlat_var->timer);
 	memset(tlat_var, 0, sizeof(*tlat_var));
 
 	osn_var->sampling = 0;

kernel/trace/trace_selftest.c

Lines changed: 18 additions & 5 deletions

@@ -942,7 +942,7 @@ static __init int test_graph_storage_multi(void)
 {
 	struct fgraph_fixture *fixture;
 	bool printed = false;
-	int i, ret;
+	int i, j, ret;
 
 	pr_cont("PASSED\n");
 	pr_info("Testing multiple fgraph storage on a function: ");
@@ -953,22 +953,35 @@ static __init int test_graph_storage_multi(void)
 		if (ret && ret != -ENODEV) {
 			pr_cont("*Could not set filter* ");
 			printed = true;
-			goto out;
+			goto out2;
 		}
+	}
 
+	for (j = 0; j < ARRAY_SIZE(store_bytes); j++) {
+		fixture = &store_bytes[j];
 		ret = register_ftrace_graph(&fixture->gops);
 		if (ret) {
 			pr_warn("Failed to init store_bytes fgraph tracing\n");
 			printed = true;
-			goto out;
+			goto out1;
 		}
 	}
 
 	DYN_FTRACE_TEST_NAME();
-out:
+out1:
+	while (--j >= 0) {
+		fixture = &store_bytes[j];
+		unregister_ftrace_graph(&fixture->gops);
+
+		if (fixture->error_str && !printed) {
+			pr_cont("*** %s ***", fixture->error_str);
+			printed = true;
+		}
+	}
+out2:
 	while (--i >= 0) {
 		fixture = &store_bytes[i];
-		unregister_ftrace_graph(&fixture->gops);
+		ftrace_free_filter(&fixture->gops.ops);
 
 		if (fixture->error_str && !printed) {
 			pr_cont("*** %s ***", fixture->error_str);