Object detection and recognition are essential in aerial surveillance applications. Aerial surveillance with unmanned aerial vehicles (UAVs) makes it possible to detect and recognize objects on the ground. However, most approaches for tracking and recognizing objects rely on background removal to produce accurate detections. Existing object detection methods for aerial imagery, such as principal component analysis, suffer from low accuracy and a high computational load. To build an intelligent aerial surveillance system using object detection and tracking methods, the aerial sensing image dataset is taken from Pedestrie . Tracking becomes difficult if the UAV has to slow down, which is not acceptable for object detection, so an algorithm is needed to analyze the images and detect objects. The main difficulty in analyzing aerial images comes from the high altitude and the rough background. The dataset may also contain various degradation factors that reduce the quality of the data, so these factors are minimized.
RetinaNet is used for object detection. Its backbone consists of two sub-networks. The Feature Pyramid Network (FPN) is used to compute a convolutional feature map over the image. The main role of these sub-networks is to place anchor boxes and localize the pedestrian in a given image. Intersection over Union (IOU) is used to track the pedestrian: it takes its input from the detection step and applies tracking frame by frame.
Degradation factors such as the UAV elevation angle, motion blur, pose, and lighting shadows reduce the quality of the data. When the elevation angle is very close to 90°, the image plane is nearly parallel to the top of the object, so only the top surface is visible. Shadows may also be detected as pedestrians. To overcome these problems, 3D depth maps are used: the depth maps are estimated and features are extracted from them, which helps to detect only the pedestrians.
1. Pedestrian detection:
Pedestrian detection involves localizing the person in a given image. RetinaNet is used for pedestrian detection and is treated as the baseline. The main reason for choosing RetinaNet as the baseline is that its results are easy to interpret. The RetinaNet backbone consists of two sub-networks, the Feature Pyramid Network (FPN) and ResNet50. The FPN is applied to the entire image to obtain a convolutional feature map over the image. The two sub-networks help to:
A. Assign the anchor boxes.
B. Classify the bounding boxes to localize the person in the given image.
Fig Object Detection
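The anchor assignment step described above can be sketched as follows. This is a minimal illustration, not the configuration used in this work: the function names `make_anchors` and `shift_anchors` are hypothetical, and the aspect ratios and scales are commonly used RetinaNet defaults that are assumed here rather than taken from the text.

```python
import math


def make_anchors(base_size, ratios=(0.5, 1.0, 2.0),
                 scales=(1.0, 2 ** (1 / 3), 2 ** (2 / 3))):
    """Return origin-centred anchors as (x1, y1, x2, y2) tuples.

    ratios are height/width; the values are assumed defaults.
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:
            # Keep the area fixed while changing the aspect ratio.
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors


def shift_anchors(anchors, feat_h, feat_w, stride):
    """Copy every anchor to the centre of each feature-map cell."""
    shifted = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for x1, y1, x2, y2 in anchors:
                shifted.append((x1 + cx, y1 + cy, x2 + cx, y2 + cy))
    return shifted


if __name__ == "__main__":
    base = make_anchors(32)
    # A 2 x 3 feature map with stride 8 gets 9 anchors per cell.
    print(len(base), len(shift_anchors(base, 2, 3, 8)))
```

Each pyramid level would use its own `base_size` and `stride`, so small pedestrians seen from high altitude are covered by the finer levels.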
2. Pedestrian tracking:
Pedestrian tracking involves following the detected object across the given images. IOU is used for pedestrian tracking. It was initially selected because it allows a fair evaluation across datasets. Intersection over Union is widely used for tracking. The IOU tracker was developed on two main assumptions:
A. The detection step returns a detection per frame for each object to be tracked.
B. The objects in consecutive frames have a high overlap.
Based on these assumptions, the IOU tracker can track pedestrians without using any image information. For this reason, it is considered computationally efficient.

Fig Intersection Over Union
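A minimal sketch of IOU-based tracking under the two assumptions above is given here. It is a simplified greedy association, not necessarily the exact tracker used in this work; the function names and the threshold of 0.5 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track_by_iou(frames, iou_threshold=0.5):
    """Greedy frame-by-frame association using only box overlap.

    frames: one list of detected boxes per frame.
    Returns a list of tracks, each a list of (frame_index, box).
    """
    active, finished = [], []
    for t, detections in enumerate(frames):
        unmatched = list(detections)
        still_active = []
        for track in active:
            last_box = track[-1][1]
            # Pick the detection that overlaps the track's last box most.
            best = max(unmatched, key=lambda d: iou(last_box, d), default=None)
            if best is not None and iou(last_box, best) >= iou_threshold:
                track.append((t, best))
                unmatched.remove(best)
                still_active.append(track)
            else:
                finished.append(track)  # no sufficient overlap: track ends
        # Every detection that matched no track starts a new track.
        still_active.extend([(t, d)] for d in unmatched)
        active = still_active
    return finished + active


if __name__ == "__main__":
    frames = [
        [(0, 0, 10, 10), (50, 50, 60, 60)],
        [(1, 1, 11, 11), (51, 50, 61, 60)],
        [(2, 2, 12, 12)],
    ]
    for track in track_by_iou(frames):
        print(track)
```

Because the association uses only box coordinates from the detection step, no image features are computed during tracking, which is where the low computational cost comes from.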
3. 3D depth map:
3D depth maps were initially developed to provide accurate results for the given class of image. In general, a 3D depth map contains all the crucial information that distinguishes a pedestrian from other objects in the surroundings. The height features obtained from the depth maps are fused with the image feature representation in the detection step, which helps to improve performance. These 3D depth maps therefore play a crucial role in the surveillance system. To add 3D depth features to the detection layer, the depth maps are first generated from a pair of orthogonal aerial images.
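One simple way to picture the fusion of height features with image features is channel concatenation, sketched below. This is an assumption made for illustration only: the text does not specify the fusion operator, the functions `height_from_depth` and `fuse_height` are hypothetical, and the ground is assumed to be the surface farthest from a downward-looking camera.

```python
def height_from_depth(depth_map):
    """Convert a top-down depth map into height above the ground.

    Assumption: the ground is the farthest surface from the camera.
    """
    ground = max(max(row) for row in depth_map)
    return [[ground - d for d in row] for row in depth_map]


def fuse_height(features, depth_map):
    """Append normalized height as one extra channel per pixel.

    features: rows of per-pixel feature vectors (lists of floats).
    """
    heights = height_from_depth(depth_map)
    # Fall back to 1.0 on a completely flat scene to avoid dividing by zero.
    peak = max(max(row) for row in heights) or 1.0
    return [
        [list(feat) + [h / peak] for feat, h in zip(feat_row, h_row)]
        for feat_row, h_row in zip(features, heights)
    ]


if __name__ == "__main__":
    depth = [[10.0, 10.0], [8.0, 10.0]]  # one raised pixel
    feats = [[[0.1, 0.2], [0.3, 0.4]], [[0.5, 0.6], [0.7, 0.8]]]
    for row in fuse_height(feats, depth):
        print(row)
```

In such a scheme a shadow lies on the ground and gets a height close to zero, while a pedestrian gets a positive height, which gives the detector a cue for telling the two apart.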


