Can you write the following in a better way? After that, can you point out some mistakes, if there are any? 

In policy gradient methods the agent is composed of two components, an actor and a critic. The actor represents the policy and is usually implemented by a with a neural network. The actor takes as input the agent's observation and outputs a probability distribution over the action space. The critic estimates the value function of the state and usually has the same neural network architecture as the actor but outputs one real value, corresponding the value of the state.
In policy gradient methods, the agent consists of two components: an actor and a critic. The actor represents the policy and is typically implemented using a neural network. It takes the agent’s observation as input and outputs a probability distribution over the action space. The critic estimates the value function of the state and usually has the same neural network architecture as the actor. However, it outputs a single real value that corresponds to the value of the state.
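To make the description concrete, here is a minimal sketch of an actor and a critic that share the same network architecture but differ only in their output head. This is an illustrative toy in NumPy with made-up layer sizes, not code from any particular library or paper.

```python
import numpy as np

def mlp_forward(params, x):
    # One hidden layer with tanh activation; params = (W1, b1, W2, b2).
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def init_mlp(rng, obs_dim, hidden_dim, out_dim):
    # Same architecture for actor and critic; only out_dim differs.
    return (rng.normal(0.0, 0.1, (obs_dim, hidden_dim)),
            np.zeros(hidden_dim),
            rng.normal(0.0, 0.1, (hidden_dim, out_dim)),
            np.zeros(out_dim))

def actor(params, obs):
    # Softmax over the logits yields a probability distribution
    # over the (discrete) action space.
    logits = mlp_forward(params, obs)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def critic(params, obs):
    # Identical architecture, but a single real-valued output: the
    # estimated value V(s) of the observed state.
    return mlp_forward(params, obs)[0]

rng = np.random.default_rng(0)
obs_dim, hidden_dim, n_actions = 4, 32, 2  # toy sizes, chosen arbitrarily

actor_params = init_mlp(rng, obs_dim, hidden_dim, n_actions)
critic_params = init_mlp(rng, obs_dim, hidden_dim, 1)

obs = rng.normal(size=obs_dim)
probs = actor(actor_params, obs)    # distribution over actions
value = critic(critic_params, obs)  # scalar state-value estimate
```

The only structural difference is the output layer: the actor ends in `n_actions` logits passed through a softmax, while the critic ends in one linear unit.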

One mistake in the original text is that it says “implemented by a with a neural network” instead of “implemented using a neural network”. Another is “corresponding the value of the state”, which should be “corresponding to the value of the state”. Finally, the opening phrase “In policy gradient methods” should be followed by a comma.