Dynamically loading cygwin1.dll to automatically find cygroot, etc #1

Closed
juntalis opened this Issue Jan 20, 2012 · 7 comments

@juntalis

Hey,

I'm currently working on a win32 (vc) port of the ln utility that creates native ntfs hard links/symbolic links/etc. Anyways, one of the features I wrote in was that if it detected that it was run from cygwin/msys, and if the paths given were POSIX-style paths, it will LoadLibraryW(L"cygwin1.dll") (or in MSYS, msys-1.0.dll) and call the appropriate functions to accurately convert the path in the context of the current environment. Since the program is being run from cygwin, you can rightfully assume that cygwin1.dll will be on the system PATH, and thus be accessible from the program. That should eliminate the need for users to have to set the CYGROOT environment variable, or specify it on the command line.

If you're interested, I can clean up the code a bit and throw it in a pull request. I'm just not sure if you're manually converting the paths for the sake of performance, since dynamically loading the cygwin dll also requires you call cygwin_dll_init. (I can probably do a quick benchmark if that's the case)

@mturk
Owner
mturk commented Jan 20, 2012

Sure. Very much interested if it's dynamic and not linked to cygwin1.dll.
The reason I did manual conversion is that the project originates from a more generic program that supported Microsoft SUA as well, but since that is abandoned now, there is no point in continuing.
Also some parts cannot be easily converted by cygwin functions (but would love to see if you can make it)
By that I mean things that are not seen as path elements directly.
Typical ones are compiler parameters
eg. /Fo:/some/path does not look like something cygwin API can directly convert.
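
A minimal sketch of one way to handle such arguments: strip a known option prefix first, then hand only the remainder to the path conversion. The prefix table and the name embedded_posix_path are hypothetical, the table is abbreviated, and matching is case sensitive throughout for simplicity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, abbreviated prefix table; a real tool would need many more. */
static const char *const option_prefixes[] = {
    "/Fo:", "/Fe:", "/OUT:", "/LIBPATH:", "/I", NULL
};

/*
 * If arg is a known option followed directly by an absolute POSIX path,
 * return a pointer to that path inside arg. Otherwise return NULL.
 */
static const char *embedded_posix_path(const char *arg)
{
    size_t i;

    for (i = 0; option_prefixes[i] != NULL; i++) {
        size_t n = strlen(option_prefixes[i]);

        /* The remainder must itself start with '/' to count as a path. */
        if (strncmp(arg, option_prefixes[i], n) == 0 && arg[n] == '/')
            return arg + n;
    }
    return NULL;
}
```

With this table, "/Fo:/some/path" yields "/some/path", while "/Ia" and "/usr/bin" yield NULL and would be handled by whatever logic treats the whole argument as a path or passes it through.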

@juntalis

Hm, good point. I'll most likely play with it a bit later this week when I have more time, and see if I can come up with an efficient solution. One idea is to import a few more functions from cygwin1.dll in addition to just the path conversion functions. Something like the opendir(), etc functions in dirent.h. That way, we could get a list of all files/folders currently in cygwin's root folder, and if the argument specified does not match any entry in that list, it is simply passed along unchanged.

Haven't really written any C/C++ code for Linux in a while, though, so there might be an easier solution. I'll check into it.
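
The matching idea can be sketched without any Cygwin calls, assuming the list of root children has already been collected somehow. The function name starts_with_root_child and the NULL-terminated list layout are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 when arg is an absolute POSIX path whose first component is
 * one of the names in the NULL-terminated root_children list, else 0.
 * Arguments that return 0 would be passed along untouched.
 */
static int starts_with_root_child(const char *arg,
                                  const char *const *root_children)
{
    size_t len;
    size_t i;

    if (arg[0] != '/')
        return 0;

    /* Length of the first component, i.e. up to the next '/' or the end. */
    len = strcspn(arg + 1, "/");
    if (len == 0)
        return 0;

    for (i = 0; root_children[i] != NULL; i++) {
        if (strlen(root_children[i]) == len &&
            strncmp(arg + 1, root_children[i], len) == 0)
            return 1;
    }
    return 0;
}
```

Given children such as "bin", "usr" and "nix", the argument "/usr/include" matches, while "/Fo:/some/path" does not, because its first component is "Fo:".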

@juntalis
juntalis commented Feb 2, 2012

Apparently I never submitted the post I wrote earlier in the week, so let me rewrite it now:

I had some free time earlier in the week, so I decided to go through and see if I could figure out the best way to address the command line options. While I haven't gone through every piece of the existing code, I've got a general feel, now, for the flow of the application. The solution I came up with is to use the cygwin runtime to resolve the root, then dynamically build the pathmatches array to reflect the children of cygwin's root folder.

/* Copyright (c) 2011 The MyoMake Project <http://www.myomake.org>
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */

#if !defined(UNICODE)
#define UNICODE
#endif
#if defined(_MSC_VER) && _MSC_VER >= 1200
#pragma warning(push, 3)
#endif

/*
 * disable or reduce the frequency of...
 *   C4057: indirection to slightly different base types
 *   C4075: slight indirection changes (unsigned short* vs short[])
 *   C4100: unreferenced formal parameter
 *   C4127: conditional expression is constant
 *   C4163: '_rotl64' : not available as an intrinsic function
 *   C4201: nonstandard extension nameless struct/unions
 *   C4244: int to char/short - precision loss
 *   C4514: unreferenced inline function removed
 */
#pragma warning(disable: 4100 4127 4163 4201 4514; once: 4057 4075 4244)

/*
 * Ignore Microsoft's interpretation of secure development
 * and the POSIX string handling API
 */
#if defined(_MSC_VER) && _MSC_VER >= 1400
#define _CRT_SECURE_NO_DEPRECATE
#endif
#pragma warning(disable: 4996)

#define WIN32_LEAN_AND_MEAN
#ifndef _WIN32_WINNT
#define _WIN32_WINNT 0x0502
#endif
#include <windows.h>
#include <tlhelp32.h>
#include <psapi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <errno.h>
#include <process.h>
#include <io.h>

#define XPATH_MAX 8192

static int debug = 0;
static const char aslicense[] = ""                                          \
    "Licensed under the Apache License, Version 2.0 (the \"License\");\n"   \
    "you may not use this file except in compliance with the License.\n"    \
    "You may obtain a copy of the License at\n\n"                           \
    "http://www.apache.org/licenses/LICENSE-2.0\n";

static const wchar_t *cyglibrary = L"cygwin1.dll";
static const wchar_t *cygroot = 0;
static wchar_t  windrive[] = { 0, L':', L'\\', 0};

static const wchar_t **pathmatches = NULL;

/**
 * Some common options for Microsoft compiler and linker
 * which start with slash.
 * If we found one of these we won't treat the option
 * as path element. Eg, /I will be option /Ia will be path.
 */
static const wchar_t *optmatch[] = {
    0,
    L"I",
    L"Fa",
    L"Fd",
    L"Fe",
    L"FI",
    L"Fl",
    L"Fm",
    L"Fo",
    L"Fr",
    L"FR",
    L"Tc",
    L"Tp",
    0, /* Separator for case sensitivity. Rest are case insensitive */
    L"BASE:@",
    L"IDLOUT:",
    L"IMPLIB:",
    L"KEYFILE:",
    L"LIBPATH:",
    L"MANIFESTFILE:",
    L"MAP:",
    L"OUTPUTRESOURCE:",
    L"OUT:",
    L"PGD:",
    L"PDB:",
    L"PDBSTRIPPED:",
    L"TLBOUT:",
    0
};

#ifdef _WIN64
typedef __int64 ssize_t;
#else
typedef _W64 int ssize_t;
#endif

typedef void (*cygwin_dll_init_fn)();

/* Possible 'what' values in calls to cygwin_conv_path/cygwin_create_path. */
enum
{
  CCP_POSIX_TO_WIN_A = 0, /* from is char*, to is char*       */
  CCP_POSIX_TO_WIN_W,     /* from is char*, to is wchar_t*    */
  CCP_WIN_A_TO_POSIX,     /* from is char*, to is char*       */
  CCP_WIN_W_TO_POSIX,     /* from is wchar_t*, to is char*    */
  /* Or these values to the above as needed. */
  CCP_ABSOLUTE = 0,   /* Request absolute path (default). */
  CCP_RELATIVE = 0x100    /* Request to keep path relative.   */
};
typedef unsigned int cygwin_conv_path_t;
/* If size is 0, cygwin_conv_path returns the required buffer size in bytes.
   Otherwise, it returns 0 on success, or -1 on error and errno is set to
   one of the below values:

    EINVAL        what has an invalid value.
    EFAULT        from or to point into nirvana.
    ENAMETOOLONG  the resulting path is longer than 32K, or, in case
          of what == CCP_POSIX_TO_WIN_A, longer than MAX_PATH.
    ENOSPC        size is less than required for the conversion.
*/
typedef ssize_t (*cygwin_conv_path_fn)(cygwin_conv_path_t what, const void *from, void *to, size_t size);
typedef ssize_t (*cygwin_conv_path_list_fn)(cygwin_conv_path_t what, const void *from, void *to, size_t size);

/* Allocate a buffer for the conversion result using malloc(3), and return
   a pointer to it.  Returns NULL if something goes wrong with errno set
   to one of the above values, or to ENOMEM if malloc fails. */
typedef void* (*cygwin_create_path_fn)(cygwin_conv_path_t what, const void *from);
typedef int (*cygwin_posix_path_list_p_fn) (const char *);
typedef void (*cygwin_split_path_fn) (const char *, char *, char *);


/**
 * Zero-filling malloc that causes process exit in case of ENOMEM
 */
static void *xmalloc(size_t size)
{
    void *p = calloc(size, 1);
    if (p == 0) {
        _wperror(L"malloc");
        _exit(1);
    }
    return p;
}

static wchar_t **waalloc(size_t size)
{
    return (wchar_t **)xmalloc((size + 1) * sizeof(wchar_t *));
}

static __inline void xfree(void *m)
{
    if (m != 0)
        free(m);
}

static void wafree(wchar_t **array)
{
    wchar_t **ptr = array;

    if (array == 0)
        return;
    while (*ptr != 0)
        xfree(*(ptr++));
    xfree(array);
}

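static BOOL load_cygwin_library(HMODULE* hCygwin)
{
    // Always go through LoadLibraryW. Unlike GetModuleHandleW, it takes a
    // reference even when the module is already loaded, so the FreeLibrary
    // calls made later stay balanced.
    if((*hCygwin = LoadLibraryW(cyglibrary)) == NULL)
        return FALSE;
    return TRUE;
}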

static BOOL init_cygwin_library(HMODULE hCygwin)
{
    cygwin_dll_init_fn cygwin_dll_init = NULL;
    if((cygwin_dll_init = (cygwin_dll_init_fn)GetProcAddress(hCygwin,"cygwin_dll_init")) == NULL) {
        FreeLibrary(hCygwin);
        return FALSE;
    }
    cygwin_dll_init();
    return TRUE;
}

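static BOOL is_folder(const wchar_t* dirname)
{
    DWORD attrs = GetFileAttributesW(dirname);

    // INVALID_FILE_ATTRIBUTES has every bit set, so rule it out before
    // testing the directory bit.
    if(attrs == INVALID_FILE_ATTRIBUTES)
        return FALSE;
    return (attrs & FILE_ATTRIBUTE_DIRECTORY) != 0;
}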

static BOOL resolve_cygroot(HMODULE hCygwin)
{
    // Our root folder. Needs to be a char array, rather than a wchar_t array.
    const void* posixPath = "/";
    wchar_t* converted = NULL;

    // cygwin_create_path allocates the buffer and performs the conversion
    // in one step, so a separate cygwin_conv_path call is not needed.
    cygwin_create_path_fn cygwin_create_path =
        (cygwin_create_path_fn)GetProcAddress(hCygwin, "cygwin_create_path");

    // Verify that the function is a valid export.
    if(cygwin_create_path == NULL) {
        FreeLibrary(hCygwin);
        return FALSE;
    }

    // Convert "/" and verify that the result is usable.
    converted = (wchar_t*)cygwin_create_path(CCP_POSIX_TO_WIN_W, posixPath);
    if((converted == NULL) || (*converted == L'\0')) {
        FreeLibrary(hCygwin);
        return FALSE;
    }

    // Keep a copy on our own heap so cygroot does not point at memory that
    // was allocated by cygwin1.dll once the library is unloaded. The original
    // buffer came from Cygwin's malloc and is deliberately left alone here.
    cygroot = _wcsdup(converted);
    if(cygroot == NULL) {
        FreeLibrary(hCygwin);
        return FALSE;
    }

    // And finally, verify that our folder exists. Returns TRUE on success.
    if(!is_folder(cygroot)) {
        FreeLibrary(hCygwin);
        return FALSE;
    }
    return TRUE;
}

static BOOL enumerate_cygroot()
{
    WIN32_FIND_DATAW dataFind;
    HANDLE hFind = INVALID_HANDLE_VALUE;
    wchar_t* findPattern = NULL;
    size_t patternLen;
    size_t rootLen;
    DWORD dwError;
    size_t childItems = 0;
    int i = 0;

    // Get cygroot length for later use.
    rootLen = lstrlenW(cygroot);

    // len(cygroot) + len("\\*") + \0
    patternLen = (rootLen + 3) * sizeof(wchar_t);

    // allocate our buffer. fatal error if we can't.
    if((findPattern = (wchar_t*)malloc(patternLen)) == NULL) {
        _wperror(L"malloc");
        _exit(EXIT_FAILURE);
    }

    // zero out our pattern buffer.
    memset((void*)findPattern, 0, patternLen);
    wcsncpy(findPattern, cygroot, rootLen);
    wcsncat(findPattern, L"\\*", 3);

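    // Open find handle and make sure it's a valid handle.
    if((hFind = FindFirstFileW(findPattern, &dataFind)) == INVALID_HANDLE_VALUE) {
        // FindFirstFileW reports failures through GetLastError, not errno.
        dwError = GetLastError();
        free(findPattern);
        wprintf(L"Cannot enumerate cygroot folder. Error is %lu\n", dwError);
        _exit(EXIT_FAILURE);
    }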

    do {
        // Don't include . or ..
        if((dataFind.cFileName[0] == L'.') && (dataFind.cFileName[1] == L'\0')) continue;
        if((dataFind.cFileName[0] == L'.') && (dataFind.cFileName[1] == L'.') && (dataFind.cFileName[2] == L'\0')) continue;

        // Add to our child count.
        childItems++;
    } while(FindNextFileW(hFind, &dataFind) != 0);

    // Get last error and close our find handle.
    dwError = GetLastError();
    FindClose(hFind);

    // We need to check to make sure the error specified is ERROR_NO_MORE_FILES
    if (dwError != ERROR_NO_MORE_FILES) {
        free(findPattern);
        wprintf(L"FindNextFile error. Error is %u\n", dwError);
        _exit(EXIT_FAILURE);
    }

    // Allocate our pattern list. +1 = "/proc/*" (which isn't a folder)
    pathmatches = (const wchar_t**)waalloc(childItems+1);

    //memset((void*)pathmatches, 0, (childItems+1) * sizeof(wchar_t*));
    // Open find handle and make sure it's a valid handle.
    if((hFind = FindFirstFileW(findPattern, &dataFind)) == INVALID_HANDLE_VALUE) {
        free(findPattern);

        wprintf(L"Cannot enumerate cygroot folder.. again.\n");
        _exit(EXIT_FAILURE);
    }

    do {
        // Don't include . or ..
        if((dataFind.cFileName[0] == L'.') && (dataFind.cFileName[1] == L'\0')) continue;
        if((dataFind.cFileName[0] == L'.') && (dataFind.cFileName[1] == L'.') && (dataFind.cFileName[2] == L'\0')) continue;

        // And now we can add an item to our pattern list.
        if(dataFind.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
            // len("/") + len(dataFind.cFileName) + len("/*") + \0, in characters
            patternLen = lstrlenW(dataFind.cFileName) + 4;
            // xmalloc zero-fills the buffer and exits on allocation failure.
            pathmatches[i] = (wchar_t*)xmalloc(patternLen * sizeof(wchar_t));
            // swprintf expects the buffer size in characters, not bytes.
            swprintf((wchar_t*)pathmatches[i], patternLen, L"/%s/*", dataFind.cFileName);
        } else {
            // len("/") + len(dataFind.cFileName) + \0, in characters
            patternLen = lstrlenW(dataFind.cFileName) + 2;
            pathmatches[i] = (wchar_t*)xmalloc(patternLen * sizeof(wchar_t));
            swprintf((wchar_t*)pathmatches[i], patternLen, L"/%s", dataFind.cFileName);
        }
        i++;
    } while(FindNextFileW(hFind, &dataFind) != 0);

    // We need to check to make sure the error specified is ERROR_NO_MORE_FILES
    dwError = GetLastError();
    FindClose(hFind);
    free(findPattern);

    if (dwError != ERROR_NO_MORE_FILES) {
        wprintf(L"FindNextFile error. Error is %u\n", dwError);
        _exit(EXIT_FAILURE);
    }

    // len("/proc/*") + \0
    patternLen = 8 * sizeof(wchar_t);
    if((pathmatches[i] = (wchar_t*)malloc(patternLen)) == NULL) {
        wprintf(L"malloc");
        _exit(EXIT_FAILURE);
    }
    memset((void*)pathmatches[i], 0, patternLen);
    wcsncpy((wchar_t*)pathmatches[i], L"/proc/*", 8);
    i++;
    pathmatches[i] = NULL;
    return TRUE;
}

static BOOL setup_root()
{
    HMODULE hCygwin = NULL;

    // Load the cygwin dll into our application.
    if(!load_cygwin_library(&hCygwin))
        return FALSE;

    // Init the cygwin environment. (Required)
    if(!init_cygwin_library(hCygwin))
        return FALSE;

    // Use exported cygwin functions to resolve our root.
    if(!resolve_cygroot(hCygwin))
        return FALSE;

    // We wont be using cygwin's dll anymore, so we can unload it.
    FreeLibrary(hCygwin);

    if(!enumerate_cygroot())
        return FALSE;

    wprintf(L"Root: %s\n", cygroot);
    return TRUE;
}


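int wmain(int argc, const wchar_t **wargv)
{
    int i;

    // pathmatches stays NULL if the cygwin root could not be resolved.
    if(!setup_root()) {
        wprintf(L"Unable to resolve the cygwin root.\n");
        return 1;
    }

    // Start at index 0 and stop at the NULL terminator.
    for(i = 0; pathmatches[i] != NULL; i++) {
        wprintf(L"Pattern: %s\n", pathmatches[i]);
    }
    return 0;
}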

If you compile the code above, then run it in cygwin, it will list the new pathmatches array. I still need to add a function to free the memory allocated for all the patterns and for cygroot, but if you're cool with using this solution, I'll go ahead and start working it into the existing application. My idea was to call setup_root() at the start of the program (and finalize_root() at the end), and then continue on to the existing parsing code, but let me know if you have a better idea.

Thanks.
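
The cleanup step mentioned above could look roughly like the sketch below. It is self-contained, so g_pathmatches and g_cygroot are stand-ins for the listing's pathmatches and cygroot, and it assumes every pointer was allocated with the C runtime's malloc family (a buffer handed out by Cygwin's own allocator would need Cygwin's free instead).

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the globals used in the listing above. */
static wchar_t **g_pathmatches = NULL;  /* NULL-terminated pattern list */
static wchar_t *g_cygroot = NULL;       /* resolved root, CRT-allocated */

/* Release everything the setup step allocated. Safe to call twice. */
static void finalize_root(void)
{
    wchar_t **p;

    if (g_pathmatches != NULL) {
        /* Free each pattern, then the array that holds them. */
        for (p = g_pathmatches; *p != NULL; p++)
            free(*p);
        free(g_pathmatches);
        g_pathmatches = NULL;
    }

    /* free(NULL) is a no-op, so no guard is needed here. */
    free(g_cygroot);
    g_cygroot = NULL;
}
```

Resetting the globals to NULL after freeing keeps a second call harmless, which is convenient if the function ends up registered with atexit as well as called explicitly.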

@mturk
Owner
mturk commented Feb 3, 2012

On 02/02/2012 10:26 PM, Charles Grunwald wrote:

> Apparently I never submitted the post I wrote earlier in the week, so let me rewrite it now:

OK.

> If you compile the code above, then run it in cygwin, it will list the new pathmatches array. I still need to add a function to free the memory allocated for all the patterns and for cygroot, but if you're cool with using this solution, I'll go ahead and start working it into the existing application. My idea was to call setup_root() at the start of the program (and finalize_root() at the end), and then continue on to the existing parsing code, but let me know if you have a better idea.

Could you fork and create a pull request?
I could review the changes more easily then.

Cheers

^TM

@juntalis juntalis pushed a commit to juntalis/cygspawn that referenced this issue Feb 3, 2012
Charles Grunwald (Juntalis) Issue #1: mturk#1 4a795f5
@juntalis
juntalis commented Feb 4, 2012

See pull request

@juntalis juntalis closed this Feb 4, 2012
@mturk
Owner
mturk commented Feb 5, 2012

Did some code cleanup (K&R style) and a few minor fixes.
For example, there is an xmalloc function that calls _exit on failure, so I replaced your malloc/check-for-null/exit sequences with that single call.

One question. Why did you choose to enumerate files from the cygwin root?
I removed that, but would like to hear the reason in case I missed something obvious.

Also, the binary now doesn't work as 64-bit, but that's a minor issue.
Cygwin is 32-bit, so a 64-bit build will always fail to load cygwin1.dll.

@juntalis
juntalis commented Feb 5, 2012

My thought process on the enumeration of cygroot was more for situations where the user has a non-standard file/folder living at the cygwin root folder. In those situations, the folder can still be accessed from cygwin as a root-relative folder. For instance:

I currently have a folder named "nix" at C:\ShellEnv (my cygroot) as I attempt to get this port of Nix working on Cygwin. Were I to specify the nix file in that folder from command line, I could access it from /nix/var/nix/gcroots/profiles.

Not really a major issue, since the folder could also be accessed from /cygdrive/c/ShellEnv/nix, but I figured I'd throw it in for kicks.
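
To make the two spellings concrete, here is a deliberately naive sketch of how each one maps to a Windows path. It is not what Cygwin does internally (real conversion goes through the mount table via cygwin_conv_path); it assumes the default /cygdrive prefix, no extra mounts, and C99 snprintf, and the name naive_posix_to_win is made up for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Map an absolute POSIX path to a Windows path, either through the
 * /cygdrive/<letter> prefix or relative to cygroot.
 * Returns 0 on success, -1 if posix is not absolute or out is too small.
 */
static int naive_posix_to_win(const char *cygroot, const char *posix,
                              char *out, size_t outlen)
{
    int n;
    char *p;

    if (posix[0] != '/')
        return -1;

    if (strncmp(posix, "/cygdrive/", 10) == 0 && posix[10] != '\0' &&
        (posix[11] == '/' || posix[11] == '\0')) {
        /* "/cygdrive/c/foo" becomes "c:/foo" before the slash fix-up. */
        n = snprintf(out, outlen, "%c:%s", posix[10],
                     posix[11] != '\0' ? posix + 11 : "/");
    } else {
        /* Anything else is taken relative to the cygwin root folder. */
        n = snprintf(out, outlen, "%s%s", cygroot, posix);
    }
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    /* Switch to Windows separators. */
    for (p = out; *p != '\0'; p++) {
        if (*p == '/')
            *p = '\\';
    }
    return 0;
}
```

With cygroot set to C:\ShellEnv, the path /nix/var/nix/gcroots/profiles resolves under C:\ShellEnv\nix, and /cygdrive/c/ShellEnv/nix resolves to c:\ShellEnv\nix, which is the same folder reached the long way round.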

I didn't really consider the repercussions associated with x64. One possible option would be to check if the cygwin DLL can be loaded (or if the current system is x64), and if not, attempt to resolve the cygroot through the previous methods.
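
One way to picture that fallback is as an ordered list of candidates, where the DLL-based result comes first and the older sources follow. This is only a sketch: first_usable_root is a hypothetical helper, and the order of the older sources (command line, then the CYGROOT environment variable) is an assumption.

```c
#include <stddef.h>

/*
 * Return the first candidate that is neither NULL nor empty, or NULL if
 * none qualifies. A candidate is NULL when its source failed, for
 * example when a 64-bit build could not load the 32-bit cygwin1.dll.
 */
static const char *first_usable_root(const char *const *candidates,
                                     size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (candidates[i] != NULL && candidates[i][0] != '\0')
            return candidates[i];
    }
    return NULL;
}
```

The caller would fill the array with whatever each method produced, e.g. the DLL result, the command-line value, and getenv("CYGROOT"), and only report an error when the helper returns NULL.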

Sorry 'bout the code style and any functions I may've overlooked. As I said, my C/C++ is a bit rusty, but let me know if there's anything I can help with. Thanks.
