Skip to content

Conversation

@nisiyong
Copy link
Contributor

Fix Non-atomic operation on volatile field

Recently I dive into SkyWalking sources code, and I find some bugs in the DataCarrier module.
It runs with something wrong although they won't affect the main feature.

IntelliJ IDEA shows the warning as follow:

Inspection info: Reports any non-atomic operations on volatile fields. Non-atomic operations on volatile fields are operations where the field is read and the value is used to update the field. It is possible for the value of the field to change between the read and the write, possibly invalidating the operation. The non-atomic operation can be avoided by surrounding it with a synchronized block or by making use of one of the classes from the java.util.concurrent.atomic package.

@codecov
Copy link

codecov bot commented Dec 15, 2020

Codecov Report

Merging #6009 (ce5b1a0) into master (065a797) will decrease coverage by 2.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #6009      +/-   ##
============================================
- Coverage     51.26%   49.20%   -2.06%     
+ Complexity     3513     3357     -156     
============================================
  Files          1680     1680              
  Lines         35749    35772      +23     
  Branches       3949     3937      -12     
============================================
- Hits          18325    17601     -724     
- Misses        16493    17235     +742     
- Partials        931      936       +5     
Impacted Files Coverage Δ Complexity Δ
...atacarrier/partition/SimpleRollingPartitioner.java 75.00% <100.00%> (ø) 2.00 <1.00> (ø)
...server/storage/plugin/jdbc/h2/H2StorageConfig.java 100.00% <100.00%> (ø) 1.00 <0.00> (ø)
...skywalking/oap/server/core/alarm/AlarmMessage.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-1.00%)
...ement/ui/template/UITemplateManagementService.java 0.00% <0.00%> (-100.00%) 0.00% <0.00%> (-6.00%)
...s/manual/database/DatabaseStatementDispatcher.java 10.00% <0.00%> (-90.00%) 1.00% <0.00%> (-1.00%)
...in/influxdb/query/UITemplateManagementDAOImpl.java 1.75% <0.00%> (-85.97%) 1.00% <0.00%> (-6.00%)
...storage/plugin/jdbc/tidb/TiDBHistoryDeleteDAO.java 0.00% <0.00%> (-81.82%) 0.00% <0.00%> (-6.00%)
.../server/receiver/mesh/TelemetryDataDispatcher.java 7.24% <0.00%> (-78.99%) 2.00% <0.00%> (-11.00%)
.../storage/plugin/jdbc/tidb/TiDBStorageProvider.java 10.00% <0.00%> (-78.00%) 3.00% <0.00%> (-6.00%)
...ap/server/core/alarm/AlarmStandardPersistence.java 11.76% <0.00%> (-76.48%) 2.00% <0.00%> (-2.00%)
... and 49 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 065a797...b682cd8. Read the comment docs.

private volatile ArrayList<Group> consumeTargets;
@SuppressWarnings("NonAtomicVolatileUpdate")
private volatile long size;
private final AtomicLong size = new AtomicLong();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need atomic long? We don't do concurrency consuming in any case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there has another way to fix this warning. Just add synchronized on the MultipleChannelsConsumer#addNewTarget, but it has already added on BulkConsumePool#add which invoke MultipleChannelsConsumer#addNewTarget.

IDE shows warning because the volatile filed has non-atomic operations on this Class, the non-atomic operations should be synchronized. Or we just use AtomicLong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't find the real case, the reason for warning is this could happen, but in our case, there isn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't mean there has a wrong logic in the release code, maybe it runs well. But I just want to indicate that was a bad practice to use volatile in this way.

If there are multi-threads invoke the MultipleChannelsConsumer#addNewTarget method and the field size still uses volatile, the size maybe a wrong value. Because

The Java volatile keyword guarantees visibility only!

If you don't want to use AtomicLong, I am willing to just add synchronized on the MultipleChannelsConsumer#addNewTarget method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are trying to avoid any possible lock, it is also slower, more or less. That is why it may be not a good idea.
SkyWalking codes are for product environment and high performance, we have no plan to use the codes to guide people how to write codes.

So, the key is, if the current version of codes has a bug, we are open to the solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I can revert MultipleChannelsConsumer first.
But the SimpleRollingPartitioner has a bug, and I have added a test case. Please take a look.

@wu-sheng wu-sheng added the TBD To be decided later, need more discussion or input. label Dec 15, 2020
@SuppressWarnings("NonAtomicVolatileUpdate")
private volatile int i = 0;

private final AtomicInteger i = new AtomicInteger();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about this. I know your point of bug, it would not increase every time. But it has higher performance with no lock, in the partition, there is a lock to avoid the real issue. So this is actually not a bug in the real world.
This is another typical thing, codes are not by the book.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the partition, there is a lock to avoid the real issue.

Could you tell me where the lock is? I want to take a look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Channel index. There is a atomic integer as index pointer.

@wu-sheng wu-sheng added this to the 8.4.0 milestone Dec 20, 2020
@wu-sheng wu-sheng added invalid The description doesn't fit the case. and removed TBD To be decided later, need more discussion or input. labels Dec 20, 2020
@wu-sheng wu-sheng closed this Dec 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

invalid The description doesn't fit the case.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants