
Question about onnxruntime #8

Open · rrjia opened this issue Jul 28, 2021 · 14 comments
Labels: onnxruntime c++ API doc · question (Further information is requested)

Comments

@rrjia commented Jul 28, 2021

Ort::Env m_env;
Ort::Session m_session;
What is the relationship between these two? The onnxruntime docs describe Ort::Env as a global singleton. If I want to build a producer-consumer inference module to increase concurrency, should all threads share one Ort::Env while each consumer thread creates its own Ort::Session? Any guidance would be appreciated.

@DefTruth (Owner)

The ENV should indeed be global. If you use the ENV inside a class, it needs to be global within that class, with its lifetime tied to the class: the ENV must be a member of the class, not a temporary variable inside one of its methods. Because a Session binds to the ENV, an ENV that is a method-local temporary is destroyed when the method returns, leaving the Session pointing at a resource that no longer exists.

By the same logic, if you want a class that uses multiple threads, each with its own Session, a workable design is to make the ENV a member of the class and have the Session in each worker thread share it (see the sketch at the end of this comment). If the ENV belonged to one thread while the Sessions' lifetimes were not confined to that thread, you could end up with dangling accesses and crashes; making the ENV class-wide (a class member) avoids this, since whenever a Session is used it points at a valid ENV. In short: as long as you ensure the ENV's lifetime is at least as long as that of every Session you use, you should be fine.

For reference, here are the relevant onnxruntime source snippets:

#ifdef ORT_API_MANUAL_INIT
const OrtApi* Global<T>::api_{};
inline void InitApi() { Global<void>::api_ = OrtGetApiBase()->GetApi(ORT_API_VERSION); }
#else
const OrtApi* Global<T>::api_ = OrtGetApiBase()->GetApi(ORT_API_VERSION);
#endif

// This returns a reference to the OrtApi interface in use, in case someone wants to use the C API functions
inline const OrtApi& GetApi() { return *Global<void>::api_; }
struct Session : Base<OrtSession> {
  explicit Session(std::nullptr_t) {}
  Session(Env& env, const ORTCHAR_T* model_path, const SessionOptions& options);
  Session(Env& env, const void* model_data, size_t model_data_length, const SessionOptions& options);
  // ...
}
struct Env : Base<OrtEnv> {
  Env(std::nullptr_t) {}
  Env(OrtLoggingLevel logging_level = ORT_LOGGING_LEVEL_WARNING, _In_ const char* logid = "");
  // ...
}

template <typename T>
struct Base {
  using contained_type = T;

  Base() = default;
  Base(T* p) : p_{p} {
    if (!p)
      ORT_CXX_API_THROW("Allocation failure", ORT_FAIL);
  }
  ~Base() { OrtRelease(p_); }
// ...
}
// snippet from the official example
 g_ort->ReleaseSessionOptions(session_options);
 g_ort->ReleaseSession(session);
 g_ort->ReleaseEnv(env);

The official examples have been moved to onnxruntime-inference-examples (apparently about a week ago; they didn't used to live there, see issue #8441).
From the onnxruntime source and example snippets above, you can roughly see that the ENV and the Session each need to be released explicitly: releasing the Session does not also release the ENV; you still have to call ReleaseEnv, or let the Env be destructed when it leaves scope. So the destruction of each worker thread's Session should not affect the shared Env. That's just my understanding, corrections welcome. Also, ort_useful_api.zh.md collects some personal notes on the onnxruntime C++ API; feel free to spread it around.
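
To make the lifetime rule concrete, here is a minimal sketch of the pattern described above: one class-level Env shared by Sessions in worker threads. The class and member names are illustrative, and it assumes a non-Windows build where ORTCHAR_T is plain char.

#include <onnxruntime_cxx_api.h>
#include <string>
#include <thread>
#include <vector>

class MultiThreadInfer {
 public:
  explicit MultiThreadInfer(const char* model_path)
      : env_(ORT_LOGGING_LEVEL_WARNING, "multi-thread-infer"),
        model_path_(model_path) {}

  void run_workers(int num_workers) {
    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) {
      workers.emplace_back([this]() {
        Ort::SessionOptions opts;  // per-thread options
        // Each worker owns its own Session, but every Session binds to the
        // same class-level Env, which outlives all of the workers.
        Ort::Session session(env_, model_path_.c_str(), opts);
        // ... build input tensors and call session.Run(...) here ...
      });
    }
    for (auto& w : workers) w.join();
  }

 private:
  Ort::Env env_;  // class member: lifetime >= every Session that binds to it
  std::string model_path_;
};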

@DefTruth added the question (Further information is requested) label on Jul 28, 2021
@rrjia (Author) commented Jul 29, 2021

Thanks a lot for the detailed answer. I've also read ort_useful_api.zh.md quite carefully; it's probably the most thorough Chinese-language material on onnxruntime so far. The answer above covers acquiring and releasing resources via the C interface. Reading <onnxruntime/core/session/onnxruntime_cxx_api.h>, I found interfaces that can release the session and Env, but calling them fails with a segmentation fault:
OrtRelease(m_session);
OrtRelease(sessionOptions);
OrtRelease(m_env);
Do you have any experience to share on releasing memory and GPU resources after using the onnxruntime C++ interface?
Also, would you mind adding me on WeChat? It's rare to run into someone else studying the onnxruntime framework.

@rrjia (Author) commented Jul 29, 2021

Thanks for the support. After reading the interface code, I found the correct usage:

Ort::OrtRelease(m_session.release());
Ort::OrtRelease(sessionOptions.release());
Ort::OrtRelease(m_env.release());

@DefTruth (Owner) commented Jul 29, 2021

Thanks for the support. After reading the interface code, I found the correct usage:

Ort::OrtRelease(m_session.release());
Ort::OrtRelease(sessionOptions.release());
Ort::OrtRelease(m_env.release());

From ort_env.cc and ort_env.h you can see that OrtEnv is actually a singleton combined with reference counting:

// definition in ort_env.h
struct OrtEnv {
 public:
  struct LoggingManagerConstructionInfo {
    LoggingManagerConstructionInfo(OrtLoggingFunction logging_function1,
                                   void* logger_param1,
                                   OrtLoggingLevel default_warning_level1,
                                   const char* logid1)
        : logging_function(logging_function1),
          logger_param(logger_param1),
          default_warning_level(default_warning_level1),
          logid(logid1) {}
    OrtLoggingFunction logging_function{};
    void* logger_param{};
    OrtLoggingLevel default_warning_level;
    const char* logid{};
  };

  static OrtEnv* GetInstance(const LoggingManagerConstructionInfo& lm_info,
                             onnxruntime::common::Status& status,
                             const OrtThreadingOptions* tp_options = nullptr);

  static void Release(OrtEnv* env_ptr);

  const onnxruntime::Environment& GetEnvironment() const {
    return *(value_.get());
  }

  onnxruntime::logging::LoggingManager* GetLoggingManager() const;
  void SetLoggingManager(std::unique_ptr<onnxruntime::logging::LoggingManager> logging_manager);

  /**
   * Registers an allocator for sharing between multiple sessions.
   * Returns an error if an allocator with the same OrtMemoryInfo is already registered.
  */
  onnxruntime::common::Status RegisterAllocator(onnxruntime::AllocatorPtr allocator);

  /**
   * Creates and registers an allocator for sharing between multiple sessions.
   * Return an error if an allocator with the same OrtMemoryInfo is already registered.
  */
  onnxruntime::common::Status CreateAndRegisterAllocator(const OrtMemoryInfo& mem_info,
                                                         const OrtArenaCfg* arena_cfg = nullptr);

 private:
  static OrtEnv* p_instance_; // static member, shared class-wide
  static onnxruntime::OrtMutex m_;
  static int ref_count_;

  std::unique_ptr<onnxruntime::Environment> value_;

  OrtEnv(std::unique_ptr<onnxruntime::Environment> value1);
  ~OrtEnv();

  ORT_DISALLOW_COPY_AND_ASSIGNMENT(OrtEnv);
};
// implementation in ort_env.cc: incrementing and decrementing the reference count
OrtEnv* OrtEnv::GetInstance(const OrtEnv::LoggingManagerConstructionInfo& lm_info,
                            onnxruntime::common::Status& status,
                            const OrtThreadingOptions* tp_options) {
  std::lock_guard<onnxruntime::OrtMutex> lock(m_);
  if (!p_instance_) {
    std::unique_ptr<LoggingManager> lmgr;
    std::string name = lm_info.logid;
    if (lm_info.logging_function) {
      std::unique_ptr<ISink> logger = onnxruntime::make_unique<LoggingWrapper>(lm_info.logging_function,
                                                                               lm_info.logger_param);
      lmgr.reset(new LoggingManager(std::move(logger),
                                    static_cast<Severity>(lm_info.default_warning_level),
                                    false,
                                    LoggingManager::InstanceType::Default,
                                    &name));
    } else {
#ifdef __ANDROID__
      ISink* sink = new AndroidLogSink();
#else
      ISink* sink = new CLogSink();
#endif

      lmgr.reset(new LoggingManager(std::unique_ptr<ISink>{sink},
                                    static_cast<Severity>(lm_info.default_warning_level),
                                    false,
                                    LoggingManager::InstanceType::Default,
                                    &name));
    }
    std::unique_ptr<onnxruntime::Environment> env;
    if (!tp_options) {
      status = onnxruntime::Environment::Create(std::move(lmgr), env);
    } else {
      status = onnxruntime::Environment::Create(std::move(lmgr), env, tp_options, true);
    }
    if (!status.IsOK()) {
      return nullptr;
    }
    p_instance_ = new OrtEnv(std::move(env));
  }

  ++ref_count_;
  return p_instance_;
}

void OrtEnv::Release(OrtEnv* env_ptr) {
  if (!env_ptr) {
    return;
  }
  std::lock_guard<onnxruntime::OrtMutex> lock(m_);
  ORT_ENFORCE(env_ptr == p_instance_);  // sanity check
  --ref_count_;
  if (ref_count_ == 0) {
    delete p_instance_;
    p_instance_ = nullptr;
  }
}  
// the real destructor unloads some global resources; per C++ destruction rules, delete also destroys the class's members, etc.
OrtEnv::~OrtEnv() {
// We don't support any shared providers in the minimal build yet
#if !defined(ORT_MINIMAL_BUILD)
  UnloadSharedProviders();
#endif
} 

Now let's see how the outermost interface leads to the source above. First, in onnxruntime_cxx_api.h, one of Env's constructors is:

inline Env::Env(OrtLoggingLevel logging_level, _In_ const char* logid) {
  ThrowOnError(GetApi().CreateEnv(logging_level, logid, &p_));
  if (strcmp(logid, "onnxruntime-node") == 0) {
    ThrowOnError(GetApi().SetLanguageProjection(p_, OrtLanguageProjection::ORT_PROJECTION_NODEJS));
  } else {
    ThrowOnError(GetApi().SetLanguageProjection(p_, OrtLanguageProjection::ORT_PROJECTION_CPLUSPLUS));
  }
}

It calls CreateEnv, whose implementation lives in onnxruntime_c_api.cc and looks like this:

ORT_API_STATUS_IMPL(OrtApis::CreateEnv, OrtLoggingLevel logging_level,
                    _In_ const char* logid, _Outptr_ OrtEnv** out) {
  API_IMPL_BEGIN
  OrtEnv::LoggingManagerConstructionInfo lm_info{nullptr, nullptr, logging_level, logid};
  Status status;
  *out = OrtEnv::GetInstance(lm_info, status);
  return ToOrtStatus(status);
  API_IMPL_END
}

Here out is assigned the singleton pointer, and GetInstance increments the reference count; the singleton is OrtEnv's p_instance_. Also, since what is exposed directly to the user is Env, we see:

struct Env : Base<OrtEnv> {} ;

// the implementation of Base
template <typename T>
struct Base {
  using contained_type = T;

  Base() = default;
  Base(T* p) : p_{p} {
    if (!p)
      ORT_CXX_API_THROW("Allocation failure", ORT_FAIL);
  }
  ~Base() { OrtRelease(p_); }

  operator T*() { return p_; }
  operator const T*() const { return p_; }

  T* release() {
    T* p = p_;
    p_ = nullptr;
    return p;
  }

 protected:
  Base(const Base&) = delete;
  Base& operator=(const Base&) = delete;
  Base(Base&& v) noexcept : p_{v.p_} { v.p_ = nullptr; }
  void operator=(Base&& v) noexcept {
    OrtRelease(p_);
    p_ = v.p_;
    v.p_ = nullptr;
  }

  T* p_{};

  template <typename>
  friend struct Unowned;  // This friend line is needed to keep the centos C++ compiler from giving an error
};

Base is a simple wrapper, and Env is a Base, which means Env merely holds a pointer to the real OrtEnv; it is not the OrtEnv itself. When you call Env's release method, the p_ held by the Env is set to nullptr and the pointer p to the real OrtEnv is returned, so externally you can use it exactly the way you did:

Ort::OrtRelease(m_env.release());

But in fact, if you look at Env's destructor, it does precisely the same thing:

~Base() { OrtRelease(p_); }

Hmm, so my take is: in C++, if you use Env or Session on the stack, OrtRelease is called automatically when they go out of scope; no manual release is needed. If they are created with new, delete also runs the destructor and hence calls OrtRelease.

The OrtEnv held by an Env is global: whether you use it in the parent thread or a child thread, it is the same singleton, and its reference count is incremented and decremented under a lock. Calling Release merely decrements the count; the real resource is freed only once the count reaches 0. This also explains why creating a temporary Env inside a class method causes trouble: on creation the count is 1; when the method returns, the Env is destructed once, calling OrtRelease once (i.e. OrtApis::ReleaseEnv, which internally calls OrtEnv::Release); the count drops to 0 and the global resource is freed. If the Env is a class member instead, it is not destructed when a method returns, so there is no problem.

Note that Session, OrtValue, OrtRunOptions and the like are not global: when you call Release on them, the resources really are freed:

#define DEFINE_RELEASE_ORT_OBJECT_FUNCTION(INPUT_TYPE, REAL_TYPE)                       \
  ORT_API(void, OrtApis::Release##INPUT_TYPE, _Frees_ptr_opt_ Ort##INPUT_TYPE* value) { \
    delete reinterpret_cast<REAL_TYPE*>(value);                                         \
  }

DEFINE_RELEASE_ORT_OBJECT_FUNCTION(Value, OrtValue)
DEFINE_RELEASE_ORT_OBJECT_FUNCTION(RunOptions, OrtRunOptions)
DEFINE_RELEASE_ORT_OBJECT_FUNCTION(Session, ::onnxruntime::InferenceSession)
DEFINE_RELEASE_ORT_OBJECT_FUNCTION(ModelMetadata, ::onnxruntime::ModelMetadata)

As you can see, delete is used to destroy the object, and per C++ destruction rules, delete also destroys the class's members. A Session's p_ actually points to an OrtSession, and an OrtSession is really an InferenceSession. Digging into InferenceSession's destructor, we find it only writes some logs and has no explicit resource-freeing behavior:

// looks like it releases nothing at all
InferenceSession::~InferenceSession() {
  if (session_options_.enable_profiling) {
    ORT_TRY {
      EndProfiling();
    }
    ORT_CATCH(const std::exception& e) {
      // TODO: Currently we have no way to transport this error to the API user
      // Maybe this should be refactored, so that profiling must be explicitly
      // started and stopped via C-API functions.
      // And not like now a session option and therefore profiling must be started
      // and stopped implicitly.
      ORT_HANDLE_EXCEPTION([&]() {
        LOGS(*session_logger_, ERROR) << "Error during EndProfiling(): " << e.what();
      });
    }
    ORT_CATCH(...) {
      LOGS(*session_logger_, ERROR) << "Unknown error during EndProfiling()";
    }
  }

#ifdef ONNXRUNTIME_ENABLE_INSTRUMENT
  if (session_activity_started_)
    TraceLoggingWriteStop(session_activity, "OrtInferenceSessionActivity");
#endif
#if !defined(ORT_MINIMAL_BUILD) && defined(ORT_MEMORY_PROFILE)
  MemoryInfo::GenerateMemoryProfile();
#endif
}

But per C++ destruction rules, delete destroys the class's members, so the resource-holding members inside the Session are in fact released, and the Session should no longer be usable. There is one more wrinkle: InferenceSession holds a SessionOptions, which is destructed along with it. But that SessionOptions is not the same object as the one you created externally; it is merely a copy of the SessionOptions you passed in:

    // use the constructed session options
    finalized_session_options = constructed_session_options;
  } else {
    // use user provided session options instance
    finalized_session_options = user_provided_session_options; // not expanding all the source; this uses the struct's default copy
...

So when the InferenceSession is destructed, the SessionOptions it holds is destructed too, but the SessionOptions you created externally is not, which is why you still can (and need to) release it on your side. But then look:

struct SessionOptions : Base<OrtSessionOptions> {};  
~Base() { OrtRelease(p_); }
 // plus a chain of macro definitions and indirection, this ultimately calls
ORT_API(void, OrtApis::ReleaseSessionOptions, _Frees_ptr_opt_ OrtSessionOptions* ptr) {
  delete ptr;
} 

The default destructor behavior of SessionOptions is exactly what your manual OrtRelease call does, so in most cases you can simply follow normal C++ scoping rules, as the sketch below shows.
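
A minimal sketch of that conclusion (the model path is a placeholder): every Ort object lives on the stack, so OrtRelease runs automatically at scope exit and no manual release is needed.

#include <onnxruntime_cxx_api.h>

void infer_once(const char* model_path) {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "raii-demo");  // OrtEnv ref count +1
  Ort::SessionOptions opts;
  Ort::Session session(env, model_path, opts);
  // ... build tensors and call session.Run(...) here ...
}  // ~Session, ~SessionOptions, ~Env run in reverse order here;
   // ~Env only decrements OrtEnv's ref count and frees it at 0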

@rrjia (Author) commented Jul 30, 2021

Thanks for the detailed source walkthrough. However, in an experiment yesterday I found a problem: after calling onnxruntime, the GPU memory cannot be fully released. My code flow is:
create env -> configure sessionOptions -> create session -> create memoryInfo -> build inputTensors -> session.Run() forward inference -> parse the resulting outputTensors -> release all resources;

Ort::OrtRelease(m_session.release());
Ort::OrtRelease(sessionOptions.release());
Ort::OrtRelease(m_env.release());
for (int i = 0; i < m_numInputs; ++i) 
{
  Ort::OrtRelease(inputTensors[i].release());
}

for (int i = 0; i < m_numOutputs; ++i) 
{
  Ort::OrtRelease(outputTensors[i].release());
}    
Ort::OrtRelease(memoryInfo.release()); 

Deep learning model: the squeezenet classification network.

GPU memory usage changes as follows:

session created successfully: 499 MB
forward inference succeeded: 567 MB
all release calls completed: 499 MB

In other words, even after I have called every release interface, onnxruntime still holds the model's GPU memory.

The complete code is as follows:

#include <onnxruntime/core/providers/cuda/cuda_provider_factory.h>
#include <onnxruntime/core/session/onnxruntime_cxx_api.h>
//#include <onnxruntime/core/session/onnxruntime_c_api.h>
#include <opencv2/opencv.hpp>
#include "ort_utility/ort_utility.hpp"
#include <cassert>   // assert
#include <cstring>   // memcpy
#include <iostream>
#include <numeric>   // std::accumulate
#include <string>
#include <vector>

using DataOutputType = std::pair<float*, std::vector<int64_t>>;

std::string toString(const ONNXTensorElementDataType dataType)
{
    switch (dataType)
    {
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT:      return "float";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8:      return "uint8_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8:       return "int8_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16:     return "uint16_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT16:      return "int16_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32:      return "int32_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64:      return "int64_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_STRING:     return "string";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL:       return "bool";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16:    return "float16";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_DOUBLE:     return "double";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT32:     return "uint32_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT64:     return "uint64_t";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX64:  return "complex with float32 real and imaginary components";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_COMPLEX128: return "complex with float64 real and imaginary components";
        case ONNX_TENSOR_ELEMENT_DATA_TYPE_BFLOAT16:   return "bfloat16";  // was a copy of the complex128 string
        default:                                       return "undefined";
    }
}


int onnx_c_forward(const std::string& model_path, const std::string& img_path)
{
    static constexpr int64_t IMG_WIDTH = 224;
    static constexpr int64_t IMG_HEIGHT = 224;
    static constexpr int64_t IMG_CHANNEL = 3;

    cv::Mat img = cv::imread(img_path);
    if (img.empty()) 
    {
        std::cerr << "Failed to read input image" << std::endl;
        return EXIT_FAILURE;
    }

    cv::resize(img, img, cv::Size(IMG_WIDTH, IMG_HEIGHT));
    img.convertTo(img, CV_32FC3);  // the decoded image is 8-bit; convert to float before copying
    int64_t dataLength = IMG_HEIGHT * IMG_WIDTH * IMG_CHANNEL;
    float* dst = new float[dataLength];
    memcpy(dst, img.data, dataLength * sizeof(float));  // memcpy counts bytes, not elements

    std::vector<float*> inputData = {dst};


    std::string m_modelPath = model_path;    
    
    Ort::AllocatorWithDefaultOptions m_ortAllocator;

    int m_gpuIdx = 6;

    std::vector<std::vector<int64_t>> m_inputShapes;
    std::vector<std::vector<int64_t>> m_outputShapes;

    std::vector<int64_t> m_inputTensorSizes;
    std::vector<int64_t> m_outputTensorSizes;

    uint8_t m_numInputs;
    uint8_t m_numOutputs;

    std::vector<char*> m_inputNodeNames;
    std::vector<char*> m_outputNodeNames;

    bool m_inputShapesProvided = false;


    Ort::Env m_env = Ort::Env(ORT_LOGGING_LEVEL_WARNING, "test");
    Ort::SessionOptions sessionOptions;

    // TODO: need to take care of the following line as it is related to CPU
    // consumption using openmp
    sessionOptions.SetIntraOpNumThreads(1);

    if (m_gpuIdx != -1) 
    {
        Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CUDA(sessionOptions, m_gpuIdx));
    }

    sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
    Ort::Session m_session = Ort::Session(m_env, m_modelPath.c_str(), sessionOptions);
    m_numInputs = m_session.GetInputCount();
    DEBUG_LOG("Model number of inputs: %d\n", m_numInputs);

    m_inputNodeNames.reserve(m_numInputs);
    m_inputTensorSizes.reserve(m_numInputs);

    m_numOutputs = m_session.GetOutputCount();
    DEBUG_LOG("Model number of outputs: %d\n", m_numOutputs);

    m_outputNodeNames.reserve(m_numOutputs);
    m_outputTensorSizes.reserve(m_numOutputs);

    // determine the model input shapes
    for (int i = 0; i < m_numInputs; i++) 
    {
        if (!m_inputShapesProvided)
        {
            Ort::TypeInfo typeInfo = m_session.GetInputTypeInfo(i);
            auto tensorInfo = typeInfo.GetTensorTypeAndShapeInfo();

            m_inputShapes.emplace_back(tensorInfo.GetShape());
        }

        const auto& curInputShape = m_inputShapes[i];

        m_inputTensorSizes.emplace_back(
            std::accumulate(std::begin(curInputShape), std::end(curInputShape), 1, std::multiplies<int64_t>()));

        char* inputName = m_session.GetInputName(i, m_ortAllocator);
        m_inputNodeNames.emplace_back(strdup(inputName));
        m_ortAllocator.Free(inputName);
    }

    // determine the model output shapes
    for (int i = 0; i < m_numOutputs; ++i) 
    {
        Ort::TypeInfo typeInfo = m_session.GetOutputTypeInfo(i);
        auto tensorInfo = typeInfo.GetTensorTypeAndShapeInfo();

        m_outputShapes.emplace_back(tensorInfo.GetShape());

        char* outputName = m_session.GetOutputName(i, m_ortAllocator);
        m_outputNodeNames.emplace_back(strdup(outputName));
        m_ortAllocator.Free(outputName);
    }

    if (m_numInputs != inputData.size()) 
    {
        throw std::runtime_error("Mismatch size of input data\n");
    }

    Ort::MemoryInfo memoryInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);

    std::vector<Ort::Value> inputTensors;
    inputTensors.reserve(m_numInputs);

    for (int i = 0; i < m_numInputs; ++i)
    {
        inputTensors.emplace_back(std::move(
            Ort::Value::CreateTensor<float>(memoryInfo, const_cast<float*>(inputData[i]), m_inputTensorSizes[i],
                                            m_inputShapes[i].data(), m_inputShapes[i].size())));
    }

    auto outputTensors = m_session.Run(Ort::RunOptions{nullptr}, m_inputNodeNames.data(), inputTensors.data(),
                                       m_numInputs, m_outputNodeNames.data(), m_numOutputs);

    assert(outputTensors.size() == m_numOutputs);
    std::vector<DataOutputType> outputData;
    outputData.reserve(m_numOutputs);

    int count = 1;
    for (auto& elem : outputTensors) 
    {
        DEBUG_LOG("type of input %d: %s", count++, toString(elem.GetTensorTypeAndShapeInfo().GetElementType()).c_str());
        outputData.emplace_back(
            std::make_pair(std::move(elem.GetTensorMutableData<float>()), elem.GetTensorTypeAndShapeInfo().GetShape()));
    }

    std::cout << "interface success. " << std::endl;

    // Ort::GetApi().ReleaseSession(m_session.release());
    Ort::OrtRelease(m_session.release());
    Ort::OrtRelease(sessionOptions.release());
    Ort::OrtRelease(m_env.release());
    for (int i = 0; i < m_numInputs; ++i) 
    {
        Ort::OrtRelease(inputTensors[i].release());
    }

    for (int i = 0; i < m_numOutputs; ++i) 
    {
        Ort::OrtRelease(outputTensors[i].release());
    }    
    Ort::OrtRelease(memoryInfo.release());    
    // Ort::OrtRelease((OrtAllocator*)m_ortAllocator);   

    std::cout << "onnxruntime release memeory. " << std::endl;

}




int main(int argc, char* argv[])
{
    if (argc != 3)
    {
        std::cerr << "Usage: [apps] [path/to/onnx/yolov3-tiny.onnx] [path/to/image]" << std::endl;
        return EXIT_FAILURE;
    }    

    const std::string ONNX_MODEL_PATH = argv[1];
    const std::string IMAGE_PATH = argv[2];

    onnx_c_forward(ONNX_MODEL_PATH, IMAGE_PATH);

    std::cout << "onnx_forward func end, return to main. " << std::endl;

    return EXIT_SUCCESS;
}

@DefTruth (Owner) commented Jul 30, 2021

Haha, I'm no expert either, just playing with this in my spare time. I haven't dug very deep into the memory question; you could take a look at the write-up onnxruntime内存增长问题探讨 (a discussion of onnxruntime memory growth). Feel free to share anything new you learn, let's study together. 🙃🙃🙃

@gyy0592 commented Aug 7, 2021

I hit a problem at the step of including lite.h: the build fails with unresolved external symbols for some ort functions, e.g.
error LNK2001: unresolved external symbol "public: void __cdecl ortcv::YoloX::detect
What should I do about this?

@DefTruth (Owner) commented Aug 7, 2021

I hit a problem at the step of including lite.h: the build fails with unresolved external symbols for some ort functions, e.g.
error LNK2001: unresolved external symbol "public: void __cdecl ortcv::YoloX::detect
What should I do about this?

Each class already has LITE_EXPORTS added for compatibility across operating systems. I haven't used this on Windows; you may need to check whether the dynamic libraries are linked correctly. Could you paste the error log here?

@dwygs commented Sep 20, 2022

I'd like to ask a similar question. In my scenario, the session should be created only once per process and then used many times from different threads within that process. But when implementing this, I found that the session's declaration and definition cannot be separated: it can only be declared and constructed in one step, Ort::Session session(env, model_path, session_options). Does that mean every Run has to load the model again? That feels very wasteful. I've only just started with C++ deployment and am unfamiliar with a lot of this; I'd appreciate any pointers!

@rrjia (Author) commented Sep 20, 2022

Try using a pointer. At declaration time:

Ort::Session    *m_session = nullptr;

and at construction time:

m_session = new Ort::Session(m_env, data.data(), data.size(), sessionOptions);

@rrjia (Author) commented Sep 20, 2022

You can also construct and initialize the model from a model path:

m_session = new Ort::Session(m_env, m_modelPath.c_str(), sessionOptions);
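
A hedged alternative sketch: a std::unique_ptr gives the same deferred construction without the manual delete that a raw new requires (the struct and member names here are illustrative):

#include <onnxruntime_cxx_api.h>
#include <memory>

struct Model {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "demo"};
  Ort::SessionOptions options;
  std::unique_ptr<Ort::Session> session;  // declared now, constructed later

  void load(const char* model_path) {
    session = std::make_unique<Ort::Session>(env, model_path, options);
  }
};

Since Ort::Session also has an explicit Session(std::nullptr_t) constructor and Base defines move assignment (both visible in the snippets earlier in this thread), yet another option is Ort::Session session{nullptr}; followed later by session = Ort::Session(env, model_path, options);.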

@dwygs commented Sep 20, 2022

Try using a pointer. At declaration time:

Ort::Session    *m_session = nullptr;

and at construction time:

m_session = new Ort::Session(m_env, data.data(), data.size(), sessionOptions);

Thanks for the answer! I'll go try it right away.

@dwygs commented Sep 20, 2022

Try using a pointer. At declaration time:

Ort::Session    *m_session = nullptr;

and at construction time:

m_session = new Ort::Session(m_env, data.data(), data.size(), sessionOptions);
m_session = new Ort::Session(m_env, data.data(), data.size(), sessionOptions);

That solved it, thanks a lot! I only started on deployment today and feel pretty clueless; I'm not very familiar with C++ either.

@mxyqsh commented Oct 23, 2023

Thanks a lot for the detailed answer. I've also read ort_useful_api.zh.md quite carefully; it's probably the most thorough Chinese-language material on onnxruntime so far. The answer above covers acquiring and releasing resources via the C interface. Reading <onnxruntime/core/session/onnxruntime_cxx_api.h>, I found interfaces that can release the session and Env, but calling them fails with a segmentation fault: OrtRelease(m_session); OrtRelease(sessionOptions); OrtRelease(m_env); Do you have any experience to share on releasing memory and GPU resources after using the onnxruntime C++ interface? Also, would you mind adding me on WeChat? It's rare to run into someone else studying the onnxruntime framework.

Could I add you on WeChat?
