Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Internal header file included from public one breaks build of external projects #34035

Closed
rtpsw opened this issue Feb 3, 2023 · 17 comments · Fixed by #34036
Closed

[C++] Internal header file included from public one breaks build of external projects #34035

rtpsw opened this issue Feb 3, 2023 · 17 comments · Fixed by #34036
Assignees
Milestone

Comments

@rtpsw
Copy link
Contributor

rtpsw commented Feb 3, 2023

Describe the bug, including details regarding any error messages, version, and platform.

There are a number of public header files that include internal header files:

$ find cpp/src/arrow/ -name "*.h" | xargs grep -l '_internal.h' | grep -v '_internal.h$'
cpp/src/arrow/compute/exec/swiss_join.h
cpp/src/arrow/compute/exec/util.h
cpp/src/arrow/compute/kernels/common.h
cpp/src/arrow/compute/kernels/row_encoder.h
cpp/src/arrow/dataset/test_util.h

This causes the build of external projects to break when including the public header because the internal one cannot be found since it is not part of the install.

Besides fixing the above public headers, it would be useful to add a (push-time?) check that ensures the above list is empty.

Component(s)

C++

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

take

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

@westonpace, I'm interested in very quickly resolving the bug at least for cpp/src/arrow/compute/exec/util.h, which is code you own. Remaining fixes will be left for later.

@westonpace
Copy link
Member

I think the problem is actually that all the headers you listed should be marked internal.

@westonpace
Copy link
Member

I can make a pass at this. A large portion of the headers in the compute folder are improperly labeled external.

@westonpace
Copy link
Member

@rtpsw Are you trying to import cpp/src/arrow/compute/exec/util.h externally? If so, we might need to split the file. Which utilities are you after?

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

What about arrow/compute/exec/map_node.h? it includes arrow/compute/exec/util.h, so if the latter is made internal it would require the former to also be, but we need it to be public.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

Are you trying to import cpp/src/arrow/compute/exec/util.h externally?

I am actually importing arrow/compute/exec/map_node.h externally. This is how I discovered this issue.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

My plan for a fix is to refactor TracedNode to be a non-template class. @westonpace, WDYT?

@westonpace
Copy link
Member

My plan for a fix is to refactor TracedNode to be a non-template class. @westonpace, WDYT?

Sounds good. I think I did it so I could access kind_name() but, in retrospect, you can probably just do:

class TracedNode {

  TracedNode(std::string_view kind_name);

};

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

Also accessing label(), which can be dynamically set, so I'd pass the ExecNode* instead.

@westonpace
Copy link
Member

label() is a bit weirder because I don't think we actually assign labels until after construction. For now, just drop label from the tracing and we will fix it later (there is a larger fix needed for tracing to be useful I think).

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

See edit of previous comment.

@westonpace
Copy link
Member

Passing ExecNode* sounds good to me.

rtpsw added a commit to rtpsw/arrow that referenced this issue Feb 3, 2023
@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

I fixed the issues around cpp/src/arrow/compute/exec/util.h in the recent commit.

Looking at cpp/src/arrow/compute/exec/swiss_join.h, I see cpp/src/arrow/compute/exec/swiss_join.cc includes it at the top. IIUC, changing the include to cpp/src/arrow/compute/exec/swiss_join_internal.h would require it be moved to its sorted order in the includes?

@westonpace
Copy link
Member

Looking at cpp/src/arrow/compute/exec/swiss_join.h, I see cpp/src/arrow/compute/exec/swiss_join.cc includes it at the top. IIUC, changing the include to cpp/src/arrow/compute/exec/swiss_join_internal.h would require it be moved to its sorted order in the includes?

Yes. Basically everything that is not in arrow/compute/api.h should be marked internal. So there are quite a few headers that will need to change.

@westonpace
Copy link
Member

Also, you can add map_node.h to arrow/compute/api.h. I agree it is a good one to make public.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 3, 2023

Basically everything that is not in arrow/compute/api.h should be marked internal. So there are quite a few headers that will need to change.

In this issue, I'd just like to avoid a breaking build of an external project due to inclusion of an installed header file, whether it is included in a api*.h or not. I'd defer on the systematic marking to a separate issue.

westonpace pushed a commit that referenced this issue Feb 3, 2023
…build of external projects (#34036)

See #34035
* Closes: #34035

Authored-by: Yaron Gvili <rtpsw@hotmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
@westonpace westonpace added this to the 12.0.0 milestone Feb 3, 2023
sjperkins pushed a commit to sjperkins/arrow that referenced this issue Feb 10, 2023
…reaks build of external projects (apache#34036)

See apache#34035
* Closes: apache#34035

Authored-by: Yaron Gvili <rtpsw@hotmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
gringasalpastor pushed a commit to gringasalpastor/arrow that referenced this issue Feb 17, 2023
…reaks build of external projects (apache#34036)

See apache#34035
* Closes: apache#34035

Authored-by: Yaron Gvili <rtpsw@hotmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
fatemehp pushed a commit to fatemehp/arrow that referenced this issue Feb 24, 2023
…reaks build of external projects (apache#34036)

See apache#34035
* Closes: apache#34035

Authored-by: Yaron Gvili <rtpsw@hotmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants