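A brief note on the current problems with RL: the biggest bottleneck for any optimization algorithm is that you often cannot find an objective function. Real-world conditions are too complex. Take autonomous driving as an example: for common cases such as obstacle avoidance and overtaking you can write rules, but when a dog runs out of a narrow alley, it is hard to find a single unified objective function that covers it. You could of course enumerate every possibility, but without an objective function that is sufficiently robust and general, the algorithm easily gets stuck in a local optimum and ignores some of the objectives, and the cost is very high. That said, in recent years OpenAI has been trying to train agents with strong generalization ability that can transfer quickly to different scenarios or tasks.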
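To make the driving example concrete, here is a minimal sketch of a hand-written objective. Everything in it is a hypothetical illustration: the `DrivingState` fields, the weights, and the 5-metre threshold are assumptions of mine, not taken from any real driving stack. It only shows how a rule-based reward stays blind to an event its author did not anticipate.

```python
from dataclasses import dataclass


@dataclass
class DrivingState:
    # Hypothetical state fields, chosen only for illustration.
    speed: float                       # metres per second
    gap_to_lead_car: float             # metres to the vehicle ahead
    lane_offset: float                 # metres from the lane centre
    unexpected_obstacle: bool = False  # e.g. a dog running out of an alley


def handwritten_reward(s: DrivingState) -> float:
    """Rule-based objective that covers only the anticipated cases."""
    reward = 0.0
    reward += 0.1 * s.speed             # encourage forward progress
    reward -= 1.0 * abs(s.lane_offset)  # stay near the lane centre
    if s.gap_to_lead_car < 5.0:         # obstacle-avoidance rule (assumed threshold)
        reward -= 10.0
    # Nothing above reads s.unexpected_obstacle, so the objective ignores it.
    return reward


normal = DrivingState(speed=10.0, gap_to_lead_car=20.0, lane_offset=0.0)
dog = DrivingState(speed=10.0, gap_to_lead_car=20.0, lane_offset=0.0,
                   unexpected_obstacle=True)

# Both states receive the same score: the optimizer gets no signal about the dog.
same_score = handwritten_reward(normal) == handwritten_reward(dog)
```

Adding another rule would patch this one case, but each new rule is another term to weight against the rest, which is the enumeration cost described above.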