
Rotate camera orientation #5

Open

hoangdado opened this issue Jul 28, 2016 · 33 comments

@hoangdado
I tried to set the camera orientation to landscape or portrait, but the code below (in DlibWrapper.mm) still returns width = 640 and height = 480 (with the preset set to AVCaptureSessionPreset640x480).
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
Because of that I couldn't do the landmark detection in portrait view. Could you fix it?

@stanchiang

stanchiang commented Aug 7, 2016

^bump

Same issue, more or less. I updated my plist to show portrait with the home button at the bottom (normal orientation) and attached the result. The "face landmark mask" is drawn upright, but the camera feed is rotated to landscape.

@zweigraf maybe you could find some time to document how the orientation gets determined/configured? I've rotated AVCaptureVideoPreviewLayer before, but rotating AVSampleBufferDisplayLayer doesn't seem possible since I couldn't find examples on Google.

img_3615

@hoangdado
Author

@stanchiang I found that even if you change the camera orientation to portrait or landscape, the values are still width = 640 and height = 480. I think the camera hardware always outputs the buffer at this size. What you can do is rotate the image when copying pixel values from the CVPixelBuffer to the dlib::array2d<dlib::bgr_pixel>. In addition, you need to rotate the input face rect.
I fixed the issue by doing that, but I ran into a performance problem: the doWorkOnSampleBuffer method consumes too much CPU.

@stanchiang

Can you add some sample code for your implementation? I was trying to do something like that but it wasn't working right.

@hoangdado
Author

hoangdado commented Aug 7, 2016

For copying pixel values:

    img.set_size(width, height);
    img.reset();
    long position = 0;

    while (img.move_next()) {
        dlib::bgr_pixel& pixel = img.element();

        // walk the portrait-sized dlib image and read the landscape buffer
        // transposed (x and y swapped)
        size_t row = position / height;
        size_t col = position % height;

        long bufferLocation = (col * width + row) * 4;

        char b = baseBuffer[bufferLocation];
        char g = baseBuffer[bufferLocation + 1];
        char r = baseBuffer[bufferLocation + 2];

        dlib::bgr_pixel newpixel(b, g, r);
        pixel = newpixel;

        position++;
    }

For rotating the face rect, I think you should code it yourself; you only need to change oneFaceRect.

@realmosn

realmosn commented Aug 8, 2016

@hoangdado this code doesn't work for me. Have you tested it?

@teresakozera

Well, I also faced this problem. I haven't solved it, but I've made a little progress. Firstly, in Target -> General:
screen shot 2016-08-08 at 16 27 43

Next, in 'SessionHandler.swift':

    func captureOutput(captureOutput: AVCaptureOutput!, didOutputSampleBuffer sampleBuffer: CMSampleBuffer!, fromConnection connection: AVCaptureConnection!) {
        connection.videoOrientation = AVCaptureVideoOrientation.Portrait
        // ... rest of the method unchanged
    }

And the outcome is:
img_0130

As you can see the landmarks are very distorted.

I hope someone will find it helpful and share the solution. :)

@hoangdado
Author

@teresakozera I solved your problem. You only need to update the convertScaleCGRect method as below:

    long right = (1.0 - rect.origin.y ) * size.width;
    long left = right - rect.size.height * size.width;
    long top = rect.origin.x * size.height;
    long bottom = top + rect.size.width * size.height;

@stanchiang You can follow this solution. It is much easier than the one I recommended before. See my fork for the source code: https://github.com/hoangdado/face-landmarking-ios

Note: with my fix, I don't know why the mouth landmarks are not exactly correct while the others are perfect!

@realmosn

realmosn commented Aug 8, 2016

@hoangdado That fixed the issue, thank you!
My issue is that the landmarks are not very accurate for the mouth and around the face. Were you able to fix it?

@stanchiang

@hoangdado thanks that helped a lot!

I also had to do an affine transformation on the layer so that the output isn't mirrored the opposite way with:

    layer.setAffineTransform(CGAffineTransformMakeRotation(CGFloat(M_PI)))
    layer.setAffineTransform(CGAffineTransformScale(layer.affineTransform(), 1, -1))
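
For anyone on a newer Swift version, the same two lines in current syntax would look roughly like this (just a sketch; layer is the AVSampleBufferDisplayLayer):

    layer.setAffineTransform(CGAffineTransform(rotationAngle: .pi))
    layer.setAffineTransform(layer.affineTransform().scaledBy(x: 1, y: -1))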

@teresakozera

teresakozera commented Aug 9, 2016

@hoangdado, thank you! It works perfectly, no distortion, even in the mouth region. :)

Previously I tried to manipulate the convertScaleCGRect method, but I was scaling and changing the parameters instead of thinking of any kind of subtraction...

@stanchiang

@teresakozera For me the distortion is more of a stability issue when trying to maintain tracking: the tolerance for faces at different angles seems to have gone down a bit, so when I move my face the mask gets jittery.

Am I facing a different issue than you guys?

@teresakozera

@stanchiang I also observed this problem, but it also existed previously, with the landscape orientation. In my case it's not that big of an issue, as I mostly need it with the head facing the camera directly. But I will also try to fix it; if I succeed I will certainly let you know. :)

@realmosn

@teresakozera Something off topic: could you please tell me how you got the landmark lines working? All I see in the app is the dots.

thanks

@stanchiang

@ArtSebus probably just used the function dlib::draw_line(img, <#const point &p1#>, <#const point &p2#>, <#const pixel_type &val#>);

@realmosn

@stanchiang Could you please suggest what I should pass in for the parameters
<#const point &p1#>, <#const point &p2#>, <#const pixel_type &val#>?
Sorry, I am not that good with C programming.

@stanchiang

@ArtSebus Haven't touched C in a few years myself, haha. But it looks like you'd need to pass in the two dlib::points that you want to connect, and then the pixel value for the line as the last argument. The existing code draws the dots with dlib::rgb_pixel(0, 255, 255), so I'd try passing that in (the 3 in that draw_solid_circle call is the circle radius, not a color).

@stanchiang

@teresakozera Trying something a little different right now. I'm storing shape.parts 60-67, which make up the mouth, in a separate array and trying to pass it into UIKit/SceneKit to draw it separately.

[m addObject: [NSValue valueWithCGPoint:CGPointMake( [DlibWrapper pixelToPoints:p.x()], [DlibWrapper pixelToPoints:p.y()]) ]];

converting from pixels to points using this function https://gist.github.com/jordiboehmelopez/3168819
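
The conversion itself is presumably along these lines (a rough sketch, assuming the gist divides by the main screen's scale factor, which is the usual pixel-to-point conversion; the Swift helper name is hypothetical):

    import UIKit

    // hypothetical Swift equivalent of the pixelToPoints helper used above
    func pixelsToPoints(_ pixels: CGFloat) -> CGFloat {
        return pixels / UIScreen.main.scale
    }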

The problem is that it still seems stuck in the old bounds, sort of like the old screenshot I posted. I wasn't expecting this because we call convertCGRectValueArray before we loop through the shape.part array.

@teresakozera

teresakozera commented Aug 11, 2016

@ArtSebus For the line method, have a look at this: https://github.com/chili-epfl/attention-tracker/blob/master/README.md :)

@stanchiang - hmmm... a little bit odd. So with the code from here and all the changes it works, but when you try the above (a conversion from pixels to points) it displays the landmarks in the other orientation? Does it happen after it is passed to UIKit, or do you check it before?

@stanchiang

stanchiang commented Aug 11, 2016

@teresakozera - solved the transformation issue; it was my own fault. But now I noticed there is an issue where my CGPoint coordinates have a weird offset for some reason.

For example, in my gamescene.swift file I had to add center = CGPointMake(center.x+50,center.y-100).

Here's my code to show you what I mean:
https://github.com/stanchiang/face-landmarking-ios

@teresakozera

@stanchiang I will have a look at it on Monday, as today I'm heading off for a slightly longer weekend. Anyway, I hope you manage to solve the problem before then. :) Have a nice weekend!

@morizotter
Contributor

morizotter commented Aug 19, 2016

@hoangdado @stanchiang Thanks! I used your solution and almost all of the problems were solved. Later, I found what I think is a better way.

I made a pull request: #9. In this PR, I convert the faceObject in SessionHandler to fit the given orientation.

Even if the connection's orientation is portrait, it works well.

What do you think?

@realmosn

@teresakozera Could you suggest how to integrate attentionTracker into the project? I've tried for some time but I'm still stuck and haven't got anywhere. I'm almost ready to pull my hair out.

@Miths19

Miths19 commented May 6, 2017

I want to crop the landmarked portion of the face. I want only the face. Can anyone help me with this?

@trungnguyen1791

trungnguyen1791 commented Aug 14, 2017

@stanchiang You could simply enable "VideoMirrored" mode with this instead of doing manual transforms:
if (connection.isVideoMirroringSupported) { connection.isVideoMirrored = true; }
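
A minimal sketch of where that could sit in SessionHandler's captureOutput, next to the orientation change discussed above (the automaticallyAdjustsVideoMirroring line is my assumption; setting isVideoMirrored directly can throw while automatic mirroring adjustment is enabled):

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        connection.videoOrientation = .portrait

        if connection.isVideoMirroringSupported {
            // assumption: turn off automatic adjustment before setting the flag manually
            connection.automaticallyAdjustsVideoMirroring = false
            connection.isVideoMirrored = true
        }

        // ... existing landmark detection and layer.enqueue(sampleBuffer) ...
    }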

@wangwenzhen

@stanchiang I want to be able to support detection in both horizontal and vertical orientations. Can you provide a sample demo?

@liamwalsh

bump
I'm still having this issue on the latest master - setting my AVCaptureConnection videoOrientation to portrait causes all of the feature points to be wrong.

@Hardy143

Hardy143 commented Jul 5, 2018

@liamwalsh were you able to find a solution? I'm having the same problem as you.

@Hardy143

Hardy143 commented Jul 5, 2018

@liamwalsh I found I was putting connection.videoOrientation = AVCaptureVideoOrientation.portrait in the wrong captureOutput function. It now works for me:

screen shot 2018-07-05 at 17 45 26
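
In other words, the orientation line belongs in the didOutput delegate method (the one that actually processes frames), not the didDrop one. Roughly:

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        connection.videoOrientation = AVCaptureVideoOrientation.portrait
        // ... existing landmark detection and layer.enqueue(sampleBuffer) ...
    }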

@jpatel956

@stanchiang

First of all, thanks for your link.
https://github.com/stanchiang/face-landmarking-ios

I have successfully run your code, but now I have the same issue you faced earlier, where my CGPoint coordinates have a weird offset for some reason.

Here I am getting an error:

validateTextureDimensions, line 759: error 'MTLTextureDescriptor has width (114046) greater than the maximum allowed size of 8192.'
validateTextureDimensions:759: failed assertion `MTLTextureDescriptor has width (114046) greater than the maximum allowed size of 8192.'

Can you please help me out?

Thanks

@ScientistMe

Hi,
I'm dealing with this issue, and I'm not able to get it working in portrait mode. I've read all the threads here.
My guess is that in portrait mode the layer the wrapper draws the points on has the wrong proportions, because the points look distorted.
IMG-4956
IMG-4957

Can you please help me?

@wonmor

wonmor commented Jul 20, 2024

I managed to fix the issue of this solution not working on the latest version of the code base, just 4 years later.
What a long time it took me to solve this.
Jokes aside, it actually took a decent amount of time to figure this out:

If you go through the instructions provided in #5, you'll quickly figure out that there's no convertScaleCGRect function in the recent version of this code base.

That's because the author has pushed a "simplified version" of the DlibWrapper class, so I had to go through the commit history and found the earlier one (from May 2016).

First and foremost, replace the entirety of your DlibWrapper.mm file with the following:

//
//  DlibWrapper.m
//  DisplayLiveSamples
//
//  Created by Luis Reisewitz on 16.05.16.
//  Copyright © 2016 ZweiGraf. All rights reserved.
//

#import "DlibWrapper.h"
#import <UIKit/UIKit.h>

#include <dlib/image_processing.h>
#include <dlib/image_io.h>

@interface DlibWrapper ()

@property (assign) BOOL prepared;

+ (dlib::rectangle)convertScaleCGRect:(CGRect)rect toDlibRectacleWithImageSize:(CGSize)size;
+ (std::vector<dlib::rectangle>)convertCGRectValueArray:(NSArray<NSValue *> *)rects toVectorWithImageSize:(CGSize)size;

@end
@implementation DlibWrapper {
    dlib::shape_predictor sp;
}


-(instancetype)init {
    self = [super init];
    if (self) {
        _prepared = NO;
    }
    return self;
}

- (void)prepare {
    NSString *modelFileName = [[NSBundle mainBundle] pathForResource:@"shape_predictor_68_face_landmarks" ofType:@"dat"];
    std::string modelFileNameCString = [modelFileName UTF8String];
    
    dlib::deserialize(modelFileNameCString) >> sp;
    
    // FIXME: test this stuff for memory leaks (cpp object destruction)
    self.prepared = YES;
}

-(void)doWorkOnSampleBuffer:(CMSampleBufferRef)sampleBuffer inRects:(NSArray<NSValue *> *)rects {
    
    if (!self.prepared) {
        [self prepare];
    }
    
    dlib::array2d<dlib::bgr_pixel> img;
    
    // MARK: magic
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);

    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    char *baseBuffer = (char *)CVPixelBufferGetBaseAddress(imageBuffer);
    
    // set_size expects rows, cols format
    img.set_size(height, width);
    
    // copy samplebuffer image data into dlib image format
    img.reset();
    long position = 0;
    while (img.move_next()) {
        dlib::bgr_pixel& pixel = img.element();

        // assuming bgra format here
        long bufferLocation = position * 4; //(row * width + column) * 4;
        char b = baseBuffer[bufferLocation];
        char g = baseBuffer[bufferLocation + 1];
        char r = baseBuffer[bufferLocation + 2];
        //        we do not need this
        //        char a = baseBuffer[bufferLocation + 3];
        
        dlib::bgr_pixel newpixel(b, g, r);
        pixel = newpixel;
        
        position++;
    }
    
    // unlock buffer again until we need it again
    CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);

    CGSize imageSize = CGSizeMake(width, height);
    
    // convert the face bounds list to dlib format
    std::vector<dlib::rectangle> convertedRectangles = [DlibWrapper convertCGRectValueArray:rects toVectorWithImageSize:imageSize];
    
    // for every detected face
    for (unsigned long j = 0; j < convertedRectangles.size(); ++j)
    {
        dlib::rectangle oneFaceRect = convertedRectangles[j];
        
        // detect all landmarks
        dlib::full_object_detection shape = sp(img, oneFaceRect);
        
        // and draw them into the image (samplebuffer)
        for (unsigned long k = 0; k < shape.num_parts(); k++) {
            dlib::point p = shape.part(k);
            draw_solid_circle(img, p, 3, dlib::rgb_pixel(0, 255, 255));
        }
    }
    
    // lets put everything back where it belongs
    CVPixelBufferLockBaseAddress(imageBuffer, 0);

    // copy dlib image data back into samplebuffer
    img.reset();
    position = 0;
    while (img.move_next()) {
        dlib::bgr_pixel& pixel = img.element();
        
        // assuming bgra format here
        long bufferLocation = position * 4; //(row * width + column) * 4;
        baseBuffer[bufferLocation] = pixel.blue;
        baseBuffer[bufferLocation + 1] = pixel.green;
        baseBuffer[bufferLocation + 2] = pixel.red;
        //        we do not need this
        //        char a = baseBuffer[bufferLocation + 3];
        
        position++;
    }
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
}

+ (dlib::rectangle)convertScaleCGRect:(CGRect)rect toDlibRectacleWithImageSize:(CGSize)size {
    long right = (1.0 - rect.origin.y ) * size.width;
    long left = right - rect.size.height * size.width;
    long top = rect.origin.x * size.height;
    long bottom = top + rect.size.width * size.height;
    
    dlib::rectangle dlibRect(left, top, right, bottom);
    return dlibRect;
}

+ (std::vector<dlib::rectangle>)convertCGRectValueArray:(NSArray<NSValue *> *)rects toVectorWithImageSize:(CGSize)size {
    std::vector<dlib::rectangle> myConvertedRects;
    for (NSValue *rectValue in rects) {
        CGRect singleRect = [rectValue CGRectValue];
        dlib::rectangle dlibRect = [DlibWrapper convertScaleCGRect:singleRect toDlibRectacleWithImageSize:size];
        myConvertedRects.push_back(dlibRect);
    }
    return myConvertedRects;
}

@end

I applied the changes made in the comments in issue #5 so that now it supports portrait mode.

Next change you need to make is...
Go to SessionHandler and locate the following:

 func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        print("DidDropSampleBuffer")
    }

BE AWARE! There are TWO captureOutput(s) — you must choose the one which has NO code inside of the function block except for a simple print line. Then you wanna add the following inside the function: connection.videoOrientation = AVCaptureVideoOrientation.portrait

So the final version of SessionHandler's captureOutput function will look like the following:

func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        print("DidDropSampleBuffer")
        connection.videoOrientation = AVCaptureVideoOrientation.portrait
    }

BOOM. All issues have been resolved. OH BY THE WAY, add session.sessionPreset = AVCaptureSession.Preset.vga640x480 RIGHT BEFORE the session.startRunning() line in ViewController.swift to enable legacy-style video streaming (640x480 instead of the larger default), so that there's LESS noise and instability in the landmark data. Lower resolution helps because it reduces the load on the device.
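
A minimal sketch of that ViewController.swift change (assuming session is the AVCaptureSession already configured there):

    session.sessionPreset = AVCaptureSession.Preset.vga640x480
    session.startRunning()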

Hope that helps. I know I may be a little late now that the Vision framework and ARKit are out, but in case you're writing C++ code in tandem with Swift and want to bridge it using Objective-C++, this is a tutorial for you!

John Seong

@wonmor

wonmor commented Jul 20, 2024

Never mind — you're supposed to add connection.videoOrientation = AVCaptureVideoOrientation.portrait to the OTHER captureOutput NOT the one I indicated above. Sorry.

@wonmor

wonmor commented Jul 20, 2024

ANOTHER UPDATE - just replace the whole captureOutput with the following:

    // MARK: AVCaptureVideoDataOutputSampleBufferDelegate
    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        connection.videoOrientation = AVCaptureVideoOrientation.portrait
        
        if !currentMetadata.isEmpty {
            let boundsArray = currentMetadata
                .compactMap { $0 as? AVMetadataFaceObject }
                .map { NSValue(cgRect: $0.bounds) }
            
            wrapper?.doWork(on: sampleBuffer, inRects: boundsArray)
        }
        
        layer.enqueue(sampleBuffer)
    }

You also have to change the .map part to .map { NSValue(cgRect: $0.bounds) }, as shown above.
