This particular model only describes the image. A worthwhile extension would be to generate captions for it.