
feat(BEVFusion): added a positional encoding-like feature extractor#16

Closed
knzo25 wants to merge 3 commits into tier4:main from knzo25:feat/bevfusion_pos_enc

Conversation

@knzo25
Contributor

@knzo25 knzo25 commented Mar 13, 2025

Summary

Added a positional encoding-like feature extractor before the sparse encoder.
The results show a large improvement in mAP, especially for the online model.

Change point

The fundamentals behind this idea can be seen from different angles.
A good study was actually performed in the first NeRF paper, and it is explained in more detail here:
https://arxiv.org/pdf/2006.10739

From a more engineering-oriented perspective, we have a signal in the range [-120, 120] m. Since the sparse convolutions have a kernel size of 3x3, the elements involved in one operation differ by at most 0.17 m × 3, which makes it difficult to learn features due to the signal-to-reference ratio. The reference itself is not really needed, since it is already encoded in the coordinates of the convolution. Simply applying multiple sin/cos functions allows the first layer to learn higher-frequency features, in theory yielding a more discerning detector. This can also be seen from the point of view of kernel theory and series expansions, but I think that is overkill.
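As a rough illustration of the idea (not the actual implementation in this PR; the function name, frequency ladder, and shapes are hypothetical), metric coordinates can be mapped through sin/cos at several frequencies so that the first sparse-convolution layer sees a high-frequency signal instead of the slowly-varying raw coordinates:

```python
import numpy as np

def positional_encode(coords, num_freqs=4, base_freq=0.5):
    """Encode metric coordinates with sin/cos at multiple frequencies.

    coords: (N, 3) array of point/voxel coordinates in meters.
    Returns an (N, 3 * 2 * num_freqs) array of bounded, high-frequency
    features that can be concatenated to the raw point features
    before the sparse encoder.
    """
    # Geometric ladder of frequencies, as in NeRF-style encodings.
    freqs = base_freq * (2.0 ** np.arange(num_freqs))
    # (N, 3, num_freqs) phase terms: each coordinate times each frequency.
    phases = coords[:, :, None] * freqs[None, None, :] * 2.0 * np.pi
    enc = np.concatenate([np.sin(phases), np.cos(phases)], axis=-1)
    return enc.reshape(coords.shape[0], -1)

coords = np.array([[10.0, -45.5, 1.2]])  # one point at x=10 m, y=-45.5 m, z=1.2 m
feats = positional_encode(coords)
print(feats.shape)  # (1, 24): 3 axes * 2 (sin/cos) * 4 frequencies
```

The output is bounded in [-1, 1] regardless of the coordinate range, so neighboring voxels differ strongly in feature space even when they are only ~0.17 m apart.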

Note

Although this does not impact other projects (the new feature is off by default), the idea itself can be applied to other models. That being said, the idea behind it is already applied to some degree in the pillar features of CenterPoint & TransFusion.

Tests performed

*Note: these are all the best eval results from the best epoch (they will be replaced with the test results later)

TIERIV INTERNAL LINK

| model                                       | range  | mAP  | car  | truck | bus  | bicycle | pedestrian |
| ------------------------------------------- | ------ | ---- | ---- | ----- | ---- | ------- | ---------- |
| BEVFusion-L base_120m_v1 (baseline)         | 122.4m | 61.8 | 77.9 | 64.5  | 57.5 | 53.4    | 55.7       |
| BEVFusion-L base_120m_v1 (encoding)         | 122.4m | 68.8 | 81.2 | 65.3  | 65.8 | 70.2    | 61.5       |
| BEVFusion-L-offline base_120m_v1 (baseline) | 122.4m | 66.0 | 85.4 | 57.4  | 51.7 | 68.1    | 67.2       |
| BEVFusion-L-offline base_120m_v1 (encoding) | 122.4m | 70.8 | 85.6 | 65.6  | 67.7 | 67.9    | 67.1       |
Links to the data and evaluation results
  • BEVFusion-L base_120m_v1 (baseline)
    • Training dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Eval dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Config file path
    • Results are in internal data.
    • Total mAP to eval dataset (eval range = 120m): 0.618
| class_name | mAP  | AP@0.5m | AP@1.0m | AP@2.0m | AP@4.0m |
| ---------- | ---- | ------- | ------- | ------- | ------- |
| car        | 77.9 | 64.4    | 78.3    | 83.4    | 85.6    |
| truck      | 64.5 | 40.8    | 62.5    | 75.0    | 79.6    |
| bus        | 57.5 | 38.1    | 55.1    | 63.7    | 73.1    |
| bicycle    | 53.4 | 45.3    | 54.2    | 56.3    | 57.6    |
| pedestrian | 55.7 | 47.5    | 54.2    | 58.4    | 62.7    |
  • BEVFusion-L base_120m_v1 (encoding)
    • Training dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Eval dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Config file path
    • Results are in internal data.
    • Total mAP to eval dataset (eval range = 120m): 0.688
| class_name | mAP  | AP@0.5m | AP@1.0m | AP@2.0m | AP@4.0m |
| ---------- | ---- | ------- | ------- | ------- | ------- |
| car        | 81.2 | 69.5    | 81.4    | 86.1    | 88.0    |
| truck      | 65.3 | 42.8    | 64.5    | 74.4    | 79.6    |
| bus        | 65.8 | 40.7    | 66.5    | 76.2    | 79.9    |
| bicycle    | 70.2 | 66.4    | 70.9    | 71.4    | 72.0    |
| pedestrian | 61.5 | 54.1    | 59.7    | 63.9    | 68.1    |
  • BEVFusion-L-offline base_120m_v1 (baseline)
    • Training dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Eval dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Config file path
    • Results are in internal data.
    • Total mAP to eval dataset (eval range = 120m): 0.66
| class_name | mAP  | AP@0.5m | AP@1.0m | AP@2.0m | AP@4.0m |
| ---------- | ---- | ------- | ------- | ------- | ------- |
| car        | 85.4 | 77.4    | 85.7    | 88.8    | 89.8    |
| truck      | 57.4 | 36.3    | 55.4    | 66.6    | 71.2    |
| bus        | 51.7 | 38.2    | 52.2    | 57.5    | 58.9    |
| bicycle    | 68.1 | 66.3    | 68.3    | 68.6    | 69.4    |
| pedestrian | 67.2 | 64.3    | 66.1    | 67.8    | 70.7    |
  • BEVFusion-L-offline base_120m_v1 (encoding)
    • Training dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Eval dataset: db_jpntaxi_v1 + db_jpntaxi_v2 + db_jpntaxi_v4 + db_gsm8_v1 + db_j6_v1 + db_j6_v2 + db_j6_v3 + db_j6_v5 +
    • Config file path
    • Results are in internal data.
    • Total mAP to eval dataset (eval range = 120m): 0.708
| class_name | mAP  | AP@0.5m | AP@1.0m | AP@2.0m | AP@4.0m |
| ---------- | ---- | ------- | ------- | ------- | ------- |
| car        | 85.6 | 77.6    | 86.0    | 88.9    | 89.8    |
| truck      | 65.6 | 45.8    | 65.7    | 73.9    | 77.0    |
| bus        | 67.7 | 49.2    | 71.6    | 74.5    | 75.6    |
| bicycle    | 67.9 | 66.1    | 68.0    | 68.3    | 69.1    |
| pedestrian | 67.1 | 64.0    | 66.0    | 67.7    | 70.8    |

…irst layer of the sparse encoder, which improves mAP quite a bit

Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp>
@knzo25 knzo25 requested a review from scepter914 March 13, 2025 07:11
Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp>
@knzo25 knzo25 self-assigned this Mar 13, 2025
@knzo25 knzo25 marked this pull request as ready for review March 13, 2025 07:30
@scepter914
Contributor

Thank you for the contribution and great work 👍

I would like to know about the result in more detail, so could you write up the result like CenterPoint-ConvNeXtPC in the document?
In particular, I want to confirm which datasets you used for training and evaluation.

- Evaluation result with test-dataset: DB JPNTAXI v1.0 + DB JPNTAXI v2.0 + DB JPNTAXI v3.0 + DB GSM8 v1.0 + DB J6 v1.0 (total frames: 1,394):
  - Total mAP (eval range = 120m): 0.686

| class_name | Count  | mAP  | AP@0.5m | AP@1.0m | AP@2.0m | AP@4.0m |
| ---------- | ------ | ---- | ------- | ------- | ------- | ------- |
| car        | 41,133 | 77.9 | 79.8    | 82.2    | 83.0    | 79.5    |
| truck      | 8,890  | 58.6 | 34.7    | 59.7    | 67.7    | 72.2    |
| bus        | 3,275  | 80.9 | 69.2    | 79.6    | 81.1    | 82.6    |
| bicycle    | 3,635  | 53.2 | 52.3    | 53.4    | 53.5    | 53.6    |
| pedestrian | 25,981 | 64.8 | 62.4    | 64.0    | 65.4    | 67.4    |

In addition to the result, would you upload to S3 model zoo? cc. @SamratThapa120

SamratThapa120 pushed a commit that referenced this pull request Mar 27, 2025
Signed-off-by: scepter914 <scepter914@gmail.com>
@knzo25
Contributor Author

knzo25 commented Apr 30, 2025

@scepter914
Apologies for the excessive delay. Formatted the output 🙏

@Shin-kyoto Shin-kyoto self-requested a review May 9, 2025 11:02
Contributor

@scepter914 scepter914 left a comment


Implementation looks great to me.
I have asked @Shin-kyoto to confirm the operational check with pseudo-labeling, so please wait a bit longer.

@SamratThapa120
Contributor

SamratThapa120 commented Jun 9, 2025

@scepter914 @Shin-kyoto
FYI, these models were trained using the positional encodings mentioned in this PR. I merged the changes locally and trained the models.

@Shin-kyoto Shin-kyoto requested a review from SamratThapa120 July 8, 2025 05:42
@Shin-kyoto
Contributor

@SamratThapa120
Can you confirm the operational check?

@SamratThapa120
Contributor

Closing, will be addressed in
#69
