Skip to content

Conversation

@gamidirohan
Copy link

@gamidirohan gamidirohan commented Apr 7, 2025

Fixes a segmentation fault encountered when creating string type constants in OpenVINO.

Problem:

A segmentation fault occurs when attempting to create a constant tensor with string data type using the following Python code:

import openvino.runtime.opset14 as ov
import numpy as np
str_const = ov.constant(np.array(['openvino'], dtype=str))

Root Cause Analysis:

The issue lies within the Constant constructor in src/core/src/op/constant.cpp. Specifically, the constructor chain for string constants includes:

Constant::Constant(const element::Type& type, const Shape& shape, const std::vectorstd::string& values)
: Constant(false, type, shape) {
// ...processing continues...
}

The problem is the passing of false for the memset_allocation parameter in the call to the base constructor. This flag determines how memory is initialized in the allocate_buffer function:

if (m_element_type == ov::element::string) {
    m_data = std::make_shared<StringAlignedBuffer>(num_elements, byte_size, host_alignment(), memset_allocation);
}

Subsequently, the constructor that handles raw data uses std::uninitialized_copy_n:

if (m_element_type == ov::element::string) {
    const auto src_strings = static_cast<const std::string*>(data);
    const auto dst_strings = static_cast<std::string*>(get_data_ptr_nc());
    std::uninitialized_copy_n(src_strings, num_elements, dst_strings);
}

std::uninitialized_copy_n expects the destination memory to be allocated but not initialized. When memset_allocation is false for string constants, the memory isn't properly prepared for string objects, leading to a segmentation fault when std::uninitialized_copy_n is called.

Solution:

This Pull Request modifies the Constant constructor for string types to ensure that memset_allocation is set to true. This will ensure that the memory allocated for string constants is properly initialized, resolving the segmentation fault.

Testing:

The fix has been tested by successfully creating string constants using the code snippet provided in the problem description. Further testing with various string lengths and multiple string elements has also been performed.

Issue Link:

This Pull Request addresses and resolves issue #23611.

@gamidirohan gamidirohan requested a review from a team as a code owner April 7, 2025 09:59
@github-actions github-actions bot added the category: Core OpenVINO Core (aka ngraph) label Apr 7, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Apr 7, 2025

Constant::Constant(const element::Type& type, const Shape& shape, const std::vector<std::string>& values)
: Constant(false, type, shape) {
: Constant(type == element::string, type, shape) { // Set memset_allocation to true for string type to fix Issue #23611
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The false is set to to initialize string twice.
The memory is allocated but not initialized
This fix will not solve root problem e.g. simple unit test shows that this code is working correctly:

TEST(constant, string_example) {
    ov::op::v0::Constant c(element::string, Shape{2}, vector<std::string>{"str1", "str2"});
    auto p = c.get_data_ptr<element::Type_t::string>();

    EXPECT_EQ(p[0], "str1");
    EXPECT_EQ(p[1], "str2");
}

@praasz praasz self-assigned this Apr 7, 2025
@mlukasze
Copy link
Contributor

hey @gamidirohan will you find a time to address comments from code review, please?

@gamidirohan
Copy link
Author

hey @gamidirohan will you find a time to address comments from code review, please?

Sure, I am working on the issue, will get back soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: Core OpenVINO Core (aka ngraph) ExternalPR External contributor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Good First Issue][Python API]: Create constant for string tensor and fix segfault

4 participants