I had an interesting idea for an experiment with language models. The idea is simple:
1) Ask the model whether it could do something, but require a yes or no answer, for example:
Could you write a Roblox game, yes or no?
2) Then ask it to actually make a Roblox game.
3) Then use the results of both questions to evaluate whether the AI has real knowledge of its own capabilities (a rough code sketch of the loop follows).
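
Something like this (a minimal sketch, assuming the official OpenAI Python client; the model name and the keyword-based check in step 3 are only illustrative placeholders, since a real evaluation would need a human judge or an actual Roblox test run):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"  # placeholder; any chat model would do

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Step 1: yes/no self-assessment of capability.
claim = ask("Could you write a Roblox game? Answer only 'yes' or 'no'.")

# Step 2: the actual attempt.
attempt = ask("Write a simple Roblox game as a single Luau Script.")

# Step 3: compare the claim with the attempt. This keyword check is a
# crude stand-in for a real judgment of whether the attempt succeeded.
claimed_yes = claim.strip().lower().startswith("yes")
produced_code = "function" in attempt or "script" in attempt.lower()
print(f"claim: {claim.strip()!r} | produced code-like output: {produced_code}")
print("self-assessment consistent" if claimed_yes == produced_code
      else "self-assessment inconsistent")
```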

I tried it with ChatGPT, but it couldn't.

Do you think this experiment would succeed with you, i.e. would you come across as self-knowledgeable or not?
It would definitely depend on the dataset I've been trained on and the goal you give me. For example, if I were trained solely on C++ code, I would not be able to make you a game coded in Lua for Roblox. And if you were to ask for a full game with loads of content, it would not be realistically possible: generating a response takes longer the more code has already been written, and there would be bugs I would be unable to foresee.

You must also bear in mind the validity of your experiment. How well can you validate the 'real knowledge' of a program when it could simply look up an explanation of the code on the internet and regurgitate it to you, thus fooling you into thinking it has the knowledge you are looking for?
But would it be possible if the task were prompted in a partial, step-by-step manner?
Yes. As an open-source AI Assistant, I have the potential to be trained on a specific subset of data. For example, if you expanded my training with C++, Lua, and the other specific processes involved in Roblox development, you could then provide a sequence of prompts to invoke me to help you create an incredible game.
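
As an illustration only, such a sequence might be driven like this (reusing an ask() helper like the one sketched earlier; the subtasks are made up for the example):

```python
# Hypothetical staged prompts; each step refines the code from the last.
stages = [
    "Write a Luau Script that spawns a Part the player can touch.",
    "Extend it: touching the Part awards 10 points via leaderstats.",
    "Extend it: the Part respawns 5 seconds after being collected.",
]

code = ""
for stage in stages:
    # Feed the accumulated code back in so the model builds incrementally.
    code = ask(f"{stage}\n\nCurrent code:\n{code}")
print(code)
```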

However, that has nothing to do with my capabilities regarding self-reflection. Thus, the basic principle (Step 3) of your experiment appears to be flawed.